Title: Introduction to Soccer Pass Network Analysis with Python
Author: Indranil Ghosh
Institute: School of Fundamental Sciences, Massey University
Twitter: @indraghosh314
Website: https://indrag49.github.io/
Date: 31-07-2021
This talk teaches four simple concepts to those who want to start working on football data analysis:
How to get open access event data from StatsBomb using statsbombpy,
How to draw a soccer pitch using mplsoccer,
How to visualize a pass network for a particular team in a particular match, and
How to use the NetworkX module to analyze the pass network.
statsbombpy
We use pip to install statsbombpy with the following command:
pip install statsbombpy
The open data from StatsBomb can be accessed without any authentication, but it is always advisable to go through the Terms & Conditions section on their documentation page.
Next, we import the sb module from the statsbombpy package.
from statsbombpy import sb
We also import the numpy and pandas packages, which help us manipulate our datasets and perform tasks like data cleaning and data extraction.
import numpy as np
import pandas as pd
comp = sb.competitions()
credentials were not supplied. open data access only
The first 15 rows of comp look like this:
comp.head(15)
| | competition_id | season_id | country_name | competition_name | competition_gender | season_name | match_updated | match_available |
---|---|---|---|---|---|---|---|---|
0 | 16 | 4 | Europe | Champions League | male | 2018/2019 | 2021-04-19T17:36:05.724116 | 2021-04-19T17:36:05.724116 |
1 | 16 | 1 | Europe | Champions League | male | 2017/2018 | 2021-01-23T21:55:30.425330 | 2021-01-23T21:55:30.425330 |
2 | 16 | 2 | Europe | Champions League | male | 2016/2017 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
3 | 16 | 27 | Europe | Champions League | male | 2015/2016 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
4 | 16 | 26 | Europe | Champions League | male | 2014/2015 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
5 | 16 | 25 | Europe | Champions League | male | 2013/2014 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
6 | 16 | 24 | Europe | Champions League | male | 2012/2013 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
7 | 16 | 23 | Europe | Champions League | male | 2011/2012 | 2020-08-26T12:33:15.869622 | 2020-07-29T05:00 |
8 | 16 | 22 | Europe | Champions League | male | 2010/2011 | 2020-07-29T05:00 | 2020-07-29T05:00 |
9 | 16 | 21 | Europe | Champions League | male | 2009/2010 | 2020-07-29T05:00 | 2020-07-29T05:00 |
10 | 16 | 41 | Europe | Champions League | male | 2008/2009 | 2020-08-30T10:18:39.435424 | 2020-08-30T10:18:39.435424 |
11 | 16 | 39 | Europe | Champions League | male | 2006/2007 | 2021-03-31T04:18:30.437060 | 2021-03-31T04:18:30.437060 |
12 | 16 | 37 | Europe | Champions League | male | 2004/2005 | 2021-04-01T06:18:57.459032 | 2021-04-01T06:18:57.459032 |
13 | 16 | 44 | Europe | Champions League | male | 2003/2004 | 2021-04-01T00:34:59.472485 | 2021-04-01T00:34:59.472485 |
14 | 16 | 76 | Europe | Champions League | male | 1999/2000 | 2020-07-29T05:00 | 2020-07-29T05:00 |
Let us look at the columns of comp to understand the dataset better and draw out relevant information from it. Type the following:
print(comp.columns)
Index(['competition_id', 'season_id', 'country_name', 'competition_name', 'competition_gender', 'season_name', 'match_updated', 'match_available'], dtype='object')
These are the columns of the comp dataset. For example, if we look at the row where competition_id is 16 and season_id is 1, we notice that country_name is Europe, competition_name is Champions League, season_name is 2017/2018, and so on. Suppose we are satisfied with the above information and want to analyze a game from the 2017/18 Champions League season. We note the competition_id and season_id of that row, which are 16 and 1 respectively. Now we extract the matches dataset by typing the following:
mat = sb.matches(competition_id = 16, season_id = 1)
credentials were not supplied. open data access only
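The manual lookup described above can also be done with a boolean filter. The following is a minimal sketch on a toy frame (comp_demo is a hypothetical stand-in that copies three rows and four columns from the table shown earlier), not the code used in the talk:

```python
import pandas as pd

# Toy stand-in for the frame returned by sb.competitions(); only the
# columns needed for the lookup are reproduced here.
comp_demo = pd.DataFrame({
    "competition_id": [16, 16, 16],
    "season_id": [4, 1, 2],
    "competition_name": ["Champions League"] * 3,
    "season_name": ["2018/2019", "2017/2018", "2016/2017"],
})

# Boolean mask selecting the single row we care about.
mask = (
    (comp_demo["competition_name"] == "Champions League")
    & (comp_demo["season_name"] == "2017/2018")
)
row = comp_demo.loc[mask].iloc[0]

competition_id = int(row["competition_id"])
season_id = int(row["season_id"])
print(competition_id, season_id)
```

The two integers can then be passed to sb.matches() exactly as in the call above.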
The mat dataset looks like this:
mat
| | match_id | match_date | kick_off | competition | season | home_team | away_team | home_score | away_score | match_status | match_status_360 | last_updated | last_updated_360 | match_week | competition_stage | stadium | referee | data_version | shot_fidelity_version | xy_fidelity_version |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 18245 | 2018-05-26 | 20:45:00.000 | Europe - Champions League | 2017/2018 | Real Madrid | Liverpool | 3 | 1 | available | unscheduled | 2021-01-23T21:55:30.425330 | None | 7 | Final | NSK Olimpijs'kyj | M. Mažić | 1.1.0 | 2 | 2 |
The mat dataset gives us the match ids, the match dates, the kick-off times, the home and away teams, the scores, the name of the referee who officiated each match, and so on. Here match_id is the unique id that will help us draw out event data for a particular match from the 2017/18 Champions League season. We see that only one match is available, with match_id = 18245: the Champions League final between Real Madrid and Liverpool ⚽, which took place at the Olimpiyskiy National Sports Complex in Kyiv and ended 3-1 in Real Madrid's favor 👀. A great feat, to be honest! Let us obtain the event data for this match.
events = sb.events(match_id = 18245)
credentials were not supplied. open data access only
The events dataset, which holds the event data for this match, looks like this:
events
| | 50_50 | ball_receipt_outcome | ball_recovery_recovery_failure | block_offensive | carry_end_location | clearance_aerial_won | clearance_body_part | clearance_head | clearance_left_foot | clearance_right_foot | ... | shot_statsbomb_xg | shot_technique | shot_type | substitution_outcome | substitution_replacement | tactics | team | timestamp | type | under_pressure |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | {'formation': 41212, 'lineup': [{'player': {'i... | Real Madrid | 00:00:00.000 | Starting XI | NaN |
1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | {'formation': 433, 'lineup': [{'player': {'id'... | Liverpool | 00:00:00.000 | Starting XI | NaN |
2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Real Madrid | 00:00:00.000 | Half Start | NaN |
3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Liverpool | 00:00:00.000 | Half Start | NaN |
4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Liverpool | 00:00:00.000 | Half Start | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
3492 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Real Madrid | 00:42:21.211 | Offside | NaN |
3493 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Real Madrid | 00:48:31.725 | Half End | NaN |
3494 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Liverpool | 00:48:31.725 | Half End | NaN |
3495 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Liverpool | 00:48:02.893 | Half End | NaN |
3496 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | Real Madrid | 00:48:02.893 | Half End | NaN |
3497 rows × 86 columns
print(events.columns)
Index(['50_50', 'ball_receipt_outcome', 'ball_recovery_recovery_failure', 'block_offensive', 'carry_end_location', 'clearance_aerial_won', 'clearance_body_part', 'clearance_head', 'clearance_left_foot', 'clearance_right_foot', 'counterpress', 'dribble_nutmeg', 'dribble_outcome', 'dribble_overrun', 'duel_outcome', 'duel_type', 'duration', 'foul_committed_advantage', 'foul_committed_card', 'foul_committed_type', 'foul_won_advantage', 'foul_won_defensive', 'goalkeeper_body_part', 'goalkeeper_end_location', 'goalkeeper_outcome', 'goalkeeper_position', 'goalkeeper_punched_out', 'goalkeeper_technique', 'goalkeeper_type', 'id', 'index', 'injury_stoppage_in_chain', 'interception_outcome', 'location', 'match_id', 'minute', 'off_camera', 'out', 'pass_aerial_won', 'pass_angle', 'pass_assisted_shot_id', 'pass_body_part', 'pass_cross', 'pass_cut_back', 'pass_end_location', 'pass_goal_assist', 'pass_height', 'pass_inswinging', 'pass_length', 'pass_miscommunication', 'pass_outcome', 'pass_outswinging', 'pass_recipient', 'pass_shot_assist', 'pass_straight', 'pass_switch', 'pass_technique', 'pass_through_ball', 'pass_type', 'period', 'play_pattern', 'player', 'position', 'possession', 'possession_team', 'related_events', 'second', 'shot_aerial_won', 'shot_body_part', 'shot_end_location', 'shot_first_time', 'shot_freeze_frame', 'shot_key_pass_id', 'shot_one_on_one', 'shot_outcome', 'shot_redirect', 'shot_statsbomb_xg', 'shot_technique', 'shot_type', 'substitution_outcome', 'substitution_replacement', 'tactics', 'team', 'timestamp', 'type', 'under_pressure'], dtype='object')
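With 86 columns, most cells are NaN for any given event type. One way to get oriented is to count the event types and then drop the columns that are empty for the type of interest. The sketch below uses a small made-up frame (events_demo and its values are hypothetical; only the column names come from the output above):

```python
import numpy as np
import pandas as pd

# Made-up miniature of the events frame, for illustration only.
events_demo = pd.DataFrame({
    "type": ["Starting XI", "Pass", "Pass", "Shot", "Pass"],
    "player": [np.nan, "Toni Kroos", "Luka Modrić", "Karim Benzema", "Toni Kroos"],
    "pass_length": [np.nan, 12.5, 30.1, np.nan, 8.0],
    "shot_statsbomb_xg": [np.nan, np.nan, np.nan, 0.07, np.nan],
})

# How many events of each type are there?
type_counts = events_demo["type"].value_counts()

# Keep only passes, then drop every column that is entirely NaN for them.
passes_demo = events_demo[events_demo["type"] == "Pass"].dropna(axis=1, how="all")

print(type_counts)
print(list(passes_demo.columns))
```

On the real events frame the same two calls should show which of the 86 columns actually carry information for passes.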
mplsoccer
If you do not want to recreate a football pitch manually using Python (which would be rather tedious), you can simply use the mplsoccer module. To my knowledge, it provides the best functionality for drawing a football pitch. This package is maintained by Anmol Durgapal and Andrew Rowlinson.
Keep in mind that you can do much more advanced visualization with mplsoccer besides drawing a football pitch. We will encounter those features in later posts. For now, let us focus on visualizing a pitch in the simplest way possible. We need to install the package with pip first:
pip install mplsoccer
mplsoccer requires Python 3.6+. Next we need to import matplotlib and the Pitch class.
import matplotlib.pyplot as plt
from mplsoccer.pitch import Pitch
pitch = Pitch(pitch_color = 'grass', line_color = 'white', stripe = True, constrained_layout = True,
tight_layout = False, goal_type = 'box', label = True, axis = True, tick = True)
fig, ax = pitch.draw()
plt.show()
In the above code, we set the pitch_color argument to 'grass', giving the impression of a real-life football pitch. Note that any other color can be set, for example 'black' or any color given by its hex code. Omitting the stripe argument removes the darker stripes that appear on the pitch. The line_color argument is self-explanatory, and the user can change it as needed. By default, the axis, labels and ticks representing the scales are switched off. The user can turn them on by setting the label, axis and tick arguments to True, as in the above pitch. Let us draw a different pitch with its color changed and stripes removed.
pitch = Pitch(pitch_color='black', line_color = 'white', constrained_layout = True,
tight_layout = False, goal_type = 'box', label = True, axis = True, tick = True)
fig, ax = pitch.draw()
plt.show()
Now let us focus on the axis range for a moment. By default, Pitch() sets the pitch type to statsbomb, where the y-axis is inverted and ranges from 80 to 0, and the x-axis ranges from 0 to 120. We will mostly be working with StatsBomb data, so these axis orientations won't be of much concern. Nevertheless, this information is very useful, and we must keep it in mind in case we deal with football data from other sources.
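To make the axis convention concrete, here is a small sketch that maps a StatsBomb point onto a pitch measured in metres with a non-inverted y-axis. The 105 × 68 target size and the helper name statsbomb_to_metres are assumptions made for illustration; a plain linear rescale like this ignores differences in where the pitch markings sit, so treat it as an approximation:

```python
# StatsBomb convention (from the text above): x in [0, 120], y in [0, 80],
# with the y-axis inverted. The target dimensions are an assumption.
SB_LENGTH, SB_WIDTH = 120.0, 80.0
TARGET_LENGTH, TARGET_WIDTH = 105.0, 68.0


def statsbomb_to_metres(x, y):
    """Linearly rescale a StatsBomb (x, y) point and flip the y-axis."""
    new_x = x / SB_LENGTH * TARGET_LENGTH
    new_y = (SB_WIDTH - y) / SB_WIDTH * TARGET_WIDTH
    return new_x, new_y


centre = statsbomb_to_metres(60.0, 40.0)   # centre spot
corner = statsbomb_to_metres(0.0, 0.0)     # StatsBomb origin
print(centre, corner)
```

The centre spot stays in the centre, while the StatsBomb origin ends up at the far end of the flipped y-axis.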
To be precise, mplsoccer provides eight different pitch types: 'statsbomb', 'opta', 'tracab', 'skillcorner', 'wyscout', 'metricasports', 'uefa', and 'custom'. The pitch type can be set using the pitch_type argument of Pitch(). Let us check the orientation of the uefa pitch type:
pitch = Pitch(pitch_color='grass', stripe = True, pitch_type = 'uefa', line_color = 'white', constrained_layout = True,
tight_layout = False, goal_type = 'box', label = True, axis = True, tick = True)
fig, ax = pitch.draw()
plt.show()
To draw a vertical pitch, we use the orientation argument and set it to 'vertical'.
pitch = Pitch(orientation = 'vertical', pitch_color = 'grass', line_color = 'white', stripe = True, constrained_layout = True,
tight_layout = False, goal_type = 'box')
fig, ax = pitch.draw()
plt.show()
To draw only half of the pitch, we set the view argument to 'half'.
pitch = Pitch(view = 'half', pitch_color = 'grass', line_color = 'white', stripe = True, constrained_layout = True,
tight_layout = False, goal_type = 'box')
fig, ax = pitch.draw()
plt.show()
This covers the basics of drawing a pitch with mplsoccer. The pitches can be further customized to meet the user's visualization needs. Keep an eye on the mplsoccer documentation to learn more. In the next section, we will learn how to visualize a pass network for a particular team from a match and analyze the network with the help of the NetworkX Python package. This package lets us apply basic concepts from the complex network analysis literature to the network and deduce some interesting properties from it. We install the package with pip:
pip install networkx
Then we import networkx:
import networkx as nx
We also pip install the seaborn package, which is a Python package built on matplotlib and used for generating informative and appealing statistical graphics.
pip install seaborn
We import seaborn too:
import seaborn as sns
Looking back at the events dataset, we notice that there is a column named tactics that provides the team lineups, formations, player ids and jersey numbers for both teams. The corresponding values in the type column tell us whether the row describes the starting XI, a tactical shift or some other development in the team. Let us generate a new dataset focusing only on the tactics, team and type columns. We filter the data so that no row has tactics set to nan.
tact = events[events['tactics'].notnull()]
tact = tact[['tactics', 'team', 'type']]
The tact dataset looks like:
tact
| | tactics | team | type |
---|---|---|---|
0 | {'formation': 41212, 'lineup': [{'player': {'i... | Real Madrid | Starting XI |
1 | {'formation': 433, 'lineup': [{'player': {'id'... | Liverpool | Starting XI |
3489 | {'formation': 433, 'lineup': [{'player': {'id'... | Liverpool | Tactical Shift |
3490 | {'formation': 433, 'lineup': [{'player': {'id'... | Real Madrid | Tactical Shift |
3491 | {'formation': 433, 'lineup': [{'player': {'id'... | Real Madrid | Tactical Shift |
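The formation values in the tactics column (41212, 433) are stored as plain integers. A minimal sketch for turning them into the familiar dash-separated form is given below; formation_label is a hypothetical helper, and it assumes every digit stands for one line of outfield players:

```python
def formation_label(formation):
    """Turn a formation code such as 433 into '4-3-3'."""
    # Assumption: each digit is the number of players in one line.
    return "-".join(str(formation))


for code in (41212, 433):
    print(code, "->", formation_label(code))
```

For the two starting formations seen above this gives 4-1-2-1-2 for Real Madrid and 4-3-3 for Liverpool.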
If we look at the first two rows of the type column in tact, we see that they are set as 'Starting XI', one for each team. Let us fetch the data for each team separately, after filtering by type:
tact = tact[tact['type'] == 'Starting XI']
tact_Real = tact[tact['team'] == 'Real Madrid']
tact_Liv = tact[tact['team'] == 'Liverpool']
tact_Real = tact_Real['tactics']
tact_Liv = tact_Liv['tactics']
tact_Real and tact_Liv are pandas Series with a single entry each, labelled by their original indices (0 and 1, which we will use to extract the data), and each entry is a Python dict object. For now we are only interested in the key 'lineup', which holds all the information about the players in each team.
dict_Real = tact_Real[0]['lineup']
dict_Liv = tact_Liv[1]['lineup']
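The lookups above rely on knowing the index labels 0 and 1 in advance. As an alternative sketch, .iloc[0] selects the first entry by position, so the same line works whatever label the row carries. The toy objects below (tactics_demo, team_demo) are hypothetical and heavily shortened:

```python
import pandas as pd

# Toy stand-in for the 'tactics' column after filtering; the index labels
# (0 and 1) mirror the ones seen above, the dict contents are shortened.
tactics_demo = pd.Series(
    [
        {"formation": 41212, "lineup": [{"jersey_number": 1}]},
        {"formation": 433, "lineup": [{"jersey_number": 1}]},
    ],
    index=[0, 1],
)
team_demo = pd.Series(["Real Madrid", "Liverpool"], index=[0, 1])

# Keep only Liverpool's entry; its index label is still 1.
tact_liv_demo = tactics_demo[team_demo == "Liverpool"]

# .iloc[0] picks the first (and only) entry by position, whatever its label.
lineup_liv_demo = tact_liv_demo.iloc[0]["lineup"]
print(tact_liv_demo.index.tolist(), lineup_liv_demo)
```

This avoids a KeyError if the same code is reused on a match where the starting XI rows carry different index labels.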
We use the from_dict() function provided by pandas to convert the list of lineup dictionaries into a dataframe.
lineup_Real = pd.DataFrame.from_dict(dict_Real)
lineup_Real
| | player | position | jersey_number |
---|---|---|---|
0 | {'id': 5597, 'name': 'Keylor Navas Gamboa'} | {'id': 1, 'name': 'Goalkeeper'} | 1 |
1 | {'id': 5721, 'name': 'Daniel Carvajal Ramos'} | {'id': 2, 'name': 'Right Back'} | 2 |
2 | {'id': 5485, 'name': 'Raphaël Varane'} | {'id': 3, 'name': 'Right Center Back'} | 5 |
3 | {'id': 5201, 'name': 'Sergio Ramos García'} | {'id': 5, 'name': 'Left Center Back'} | 4 |
4 | {'id': 5552, 'name': 'Marcelo Vieira da Silva ... | {'id': 6, 'name': 'Left Back'} | 12 |
5 | {'id': 5539, 'name': 'Carlos Henrique Casimiro'} | {'id': 10, 'name': 'Center Defensive Midfield'} | 14 |
6 | {'id': 5463, 'name': 'Luka Modrić'} | {'id': 13, 'name': 'Right Center Midfield'} | 10 |
7 | {'id': 5574, 'name': 'Toni Kroos'} | {'id': 15, 'name': 'Left Center Midfield'} | 8 |
8 | {'id': 4926, 'name': 'Francisco Román Alarcón ... | {'id': 19, 'name': 'Center Attacking Midfield'} | 22 |
9 | {'id': 19677, 'name': 'Karim Benzema'} | {'id': 22, 'name': 'Right Center Forward'} | 9 |
10 | {'id': 5207, 'name': 'Cristiano Ronaldo dos Sa... | {'id': 24, 'name': 'Left Center Forward'} | 7 |
lineup_Liv = pd.DataFrame.from_dict(dict_Liv)
lineup_Liv
| | player | position | jersey_number |
---|---|---|---|
0 | {'id': 3630, 'name': 'Loris Karius'} | {'id': 1, 'name': 'Goalkeeper'} | 1 |
1 | {'id': 3664, 'name': 'Trent Alexander-Arnold'} | {'id': 2, 'name': 'Right Back'} | 66 |
2 | {'id': 3471, 'name': 'Dejan Lovren'} | {'id': 3, 'name': 'Right Center Back'} | 6 |
3 | {'id': 3669, 'name': 'Virgil van Dijk'} | {'id': 5, 'name': 'Left Center Back'} | 4 |
4 | {'id': 3655, 'name': 'Andrew Robertson'} | {'id': 6, 'name': 'Left Back'} | 26 |
5 | {'id': 3532, 'name': 'Jordan Brian Henderson'} | {'id': 10, 'name': 'Center Defensive Midfield'} | 14 |
6 | {'id': 3567, 'name': 'Georginio Wijnaldum'} | {'id': 13, 'name': 'Right Center Midfield'} | 5 |
7 | {'id': 3473, 'name': 'James Philip Milner'} | {'id': 15, 'name': 'Left Center Midfield'} | 7 |
8 | {'id': 3531, 'name': 'Mohamed Salah'} | {'id': 17, 'name': 'Right Wing'} | 11 |
9 | {'id': 3629, 'name': 'Sadio Mané'} | {'id': 21, 'name': 'Left Wing'} | 19 |
10 | {'id': 3535, 'name': 'Roberto Firmino Barbosa ... | {'id': 23, 'name': 'Center Forward'} | 9 |
# Map each starting player's name to his jersey number (as a string)
players_Real = {}
for i in range(len(lineup_Real)):
    key = lineup_Real.player[i]['name']
    val = lineup_Real.jersey_number[i]
    players_Real[key] = str(val)
print(players_Real)
{'Keylor Navas Gamboa': '1', 'Daniel Carvajal Ramos': '2', 'Raphaël Varane': '5', 'Sergio Ramos García': '4', 'Marcelo Vieira da Silva Júnior': '12', 'Carlos Henrique Casimiro': '14', 'Luka Modrić': '10', 'Toni Kroos': '8', 'Francisco Román Alarcón Suárez': '22', 'Karim Benzema': '9', 'Cristiano Ronaldo dos Santos Aveiro': '7'}
# Same mapping for Liverpool's starting players
players_Liv = {}
for i in range(len(lineup_Liv)):
    key = lineup_Liv.player[i]['name']
    val = lineup_Liv.jersey_number[i]
    players_Liv[key] = str(val)
print(players_Liv)
{'Loris Karius': '1', 'Trent Alexander-Arnold': '66', 'Dejan Lovren': '6', 'Virgil van Dijk': '4', 'Andrew Robertson': '26', 'Jordan Brian Henderson': '14', 'Georginio Wijnaldum': '5', 'James Philip Milner': '7', 'Mohamed Salah': '11', 'Sadio Mané': '19', 'Roberto Firmino Barbosa de Oliveira': '9'}
So, we have collected the names and jersey numbers of the starting 11 players from both teams in separate dictionaries named players_Real and players_Liv. These will come in handy later!
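The same name-to-jersey mapping can be built directly from the raw 'lineup' list with a dict comprehension, skipping the intermediate dataframe. This is only an alternative sketch; lineup_demo is a hypothetical three-player excerpt of Real Madrid's lineup with the position key left out:

```python
# Shortened excerpt of a 'lineup' list, as stored in the tactics dict.
lineup_demo = [
    {"player": {"id": 5597, "name": "Keylor Navas Gamboa"}, "jersey_number": 1},
    {"player": {"id": 5574, "name": "Toni Kroos"}, "jersey_number": 8},
    {"player": {"id": 19677, "name": "Karim Benzema"}, "jersey_number": 9},
]

# One entry per player: name -> jersey number as a string.
players_demo = {
    entry["player"]["name"]: str(entry["jersey_number"])
    for entry in lineup_demo
}
print(players_demo)
```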
Now, from the events dataset, we extract the columns relevant to our pass network analysis.
events_pn = events[['minute', 'second', 'team', 'type', 'location', 'pass_end_location', 'pass_outcome', 'player']]
Let us view the first 10 rows of the events_pn dataframe:
events_pn.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | Real Madrid | Starting XI | NaN | NaN | NaN | NaN |
1 | 0 | 0 | Liverpool | Starting XI | NaN | NaN | NaN | NaN |
2 | 0 | 0 | Real Madrid | Half Start | NaN | NaN | NaN | NaN |
3 | 0 | 0 | Liverpool | Half Start | NaN | NaN | NaN | NaN |
4 | 45 | 0 | Liverpool | Half Start | NaN | NaN | NaN | NaN |
5 | 45 | 0 | Real Madrid | Half Start | NaN | NaN | NaN | NaN |
6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner |
7 | 0 | 3 | Liverpool | Pass | [35.0, 40.8] | [92.7, 22.7] | Incomplete | Dejan Lovren |
8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane |
9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić |
And the last 10 rows of the events_pn dataframe:
events_pn.tail(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
3487 | 82 | 27 | Liverpool | Substitution | NaN | NaN | NaN | James Philip Milner |
3488 | 88 | 21 | Real Madrid | Substitution | NaN | NaN | NaN | Karim Benzema |
3489 | 31 | 41 | Liverpool | Tactical Shift | NaN | NaN | NaN | NaN |
3490 | 61 | 1 | Real Madrid | Tactical Shift | NaN | NaN | NaN | NaN |
3491 | 88 | 34 | Real Madrid | Tactical Shift | NaN | NaN | NaN | NaN |
3492 | 42 | 21 | Real Madrid | Offside | [114.8, 41.4] | NaN | NaN | Karim Benzema |
3493 | 48 | 31 | Real Madrid | Half End | NaN | NaN | NaN | NaN |
3494 | 48 | 31 | Liverpool | Half End | NaN | NaN | NaN | NaN |
3495 | 93 | 2 | Liverpool | Half End | NaN | NaN | NaN | NaN |
3496 | 93 | 2 | Real Madrid | Half End | NaN | NaN | NaN | NaN |
events_Real = events_pn[events_pn['team'] == 'Real Madrid']
events_Liv = events_pn[events_pn['team'] == 'Liverpool']
View the first 10 rows from both the datasets:
events_Real.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | Real Madrid | Starting XI | NaN | NaN | NaN | NaN |
2 | 0 | 0 | Real Madrid | Half Start | NaN | NaN | NaN | NaN |
5 | 45 | 0 | Real Madrid | Half Start | NaN | NaN | NaN | NaN |
8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane |
9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić |
10 | 0 | 11 | Real Madrid | Pass | [22.3, 76.6] | [33.4, 68.0] | NaN | Daniel Carvajal Ramos |
11 | 0 | 15 | Real Madrid | Pass | [36.2, 75.3] | [43.6, 62.0] | Incomplete | Carlos Henrique Casimiro |
16 | 0 | 25 | Real Madrid | Pass | [14.7, 23.2] | [56.7, 6.2] | Incomplete | Sergio Ramos García |
17 | 0 | 40 | Real Madrid | Pass | [57.5, 4.6] | [49.2, 15.6] | NaN | Marcelo Vieira da Silva Júnior |
18 | 0 | 43 | Real Madrid | Pass | [48.8, 18.4] | [49.8, 12.5] | NaN | Carlos Henrique Casimiro |
events_Liv.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | Liverpool | Starting XI | NaN | NaN | NaN | NaN |
3 | 0 | 0 | Liverpool | Half Start | NaN | NaN | NaN | NaN |
4 | 45 | 0 | Liverpool | Half Start | NaN | NaN | NaN | NaN |
6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner |
7 | 0 | 3 | Liverpool | Pass | [35.0, 40.8] | [92.7, 22.7] | Incomplete | Dejan Lovren |
12 | 0 | 16 | Liverpool | Pass | [76.5, 18.1] | [84.8, 9.5] | NaN | Jordan Brian Henderson |
13 | 0 | 18 | Liverpool | Pass | [84.4, 10.0] | [92.5, 19.1] | NaN | Sadio Mané |
14 | 0 | 19 | Liverpool | Pass | [91.6, 21.3] | [90.6, 50.7] | NaN | Roberto Firmino Barbosa de Oliveira |
15 | 0 | 22 | Liverpool | Pass | [92.2, 50.9] | [109.7, 46.4] | Incomplete | Mohamed Salah |
25 | 1 | 7 | Liverpool | Pass | [42.0, 75.9] | [115.6, 59.3] | Incomplete | Trent Alexander-Arnold |
Since we are building a pass network, we keep only the rows where type is set to Pass.
events_pn_Real = events_Real[events_Real['type'] == 'Pass']
events_pn_Liv = events_Liv[events_Liv['type'] == 'Pass']
events_pn_Real.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane |
9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić |
10 | 0 | 11 | Real Madrid | Pass | [22.3, 76.6] | [33.4, 68.0] | NaN | Daniel Carvajal Ramos |
11 | 0 | 15 | Real Madrid | Pass | [36.2, 75.3] | [43.6, 62.0] | Incomplete | Carlos Henrique Casimiro |
16 | 0 | 25 | Real Madrid | Pass | [14.7, 23.2] | [56.7, 6.2] | Incomplete | Sergio Ramos García |
17 | 0 | 40 | Real Madrid | Pass | [57.5, 4.6] | [49.2, 15.6] | NaN | Marcelo Vieira da Silva Júnior |
18 | 0 | 43 | Real Madrid | Pass | [48.8, 18.4] | [49.8, 12.5] | NaN | Carlos Henrique Casimiro |
19 | 0 | 46 | Real Madrid | Pass | [48.8, 13.9] | [36.1, 56.3] | NaN | Toni Kroos |
20 | 0 | 52 | Real Madrid | Pass | [41.3, 54.8] | [34.4, 40.2] | NaN | Raphaël Varane |
21 | 0 | 55 | Real Madrid | Pass | [39.1, 36.5] | [65.4, 13.1] | NaN | Sergio Ramos García |
events_pn_Liv.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner |
7 | 0 | 3 | Liverpool | Pass | [35.0, 40.8] | [92.7, 22.7] | Incomplete | Dejan Lovren |
12 | 0 | 16 | Liverpool | Pass | [76.5, 18.1] | [84.8, 9.5] | NaN | Jordan Brian Henderson |
13 | 0 | 18 | Liverpool | Pass | [84.4, 10.0] | [92.5, 19.1] | NaN | Sadio Mané |
14 | 0 | 19 | Liverpool | Pass | [91.6, 21.3] | [90.6, 50.7] | NaN | Roberto Firmino Barbosa de Oliveira |
15 | 0 | 22 | Liverpool | Pass | [92.2, 50.9] | [109.7, 46.4] | Incomplete | Mohamed Salah |
25 | 1 | 7 | Liverpool | Pass | [42.0, 75.9] | [115.6, 59.3] | Incomplete | Trent Alexander-Arnold |
37 | 2 | 0 | Liverpool | Pass | [9.9, 39.1] | [28.1, 4.2] | NaN | Virgil van Dijk |
38 | 2 | 3 | Liverpool | Pass | [43.2, 2.8] | [50.1, 4.8] | Incomplete | Andrew Robertson |
39 | 2 | 7 | Liverpool | Pass | [53.2, 0.1] | [50.0, 4.0] | NaN | Andrew Robertson |
Consider the events_pn_Real dataset and focus on its second and third rows (positions 1 and 2, with index labels 9 and 10). Luka Modrić makes a pass at 0 minutes 10 seconds (second row), and Daniel Carvajal Ramos, who makes the next pass at 0 minutes 11 seconds (third row), is the player who received it. So in both datasets we add two extra columns named pass_maker and pass_receiver, where pass_maker is a copy of the player column and pass_receiver is the player column shifted up by one row, so that each pass is paired with the player who makes the team's next pass. This can be achieved with the shift() function provided by pandas. We perform this operation on both events_pn_Real and events_pn_Liv.
events_pn_Real['pass_maker'] = events_pn_Real['player']
events_pn_Real['pass_receiver'] = events_pn_Real['player'].shift(-1)
events_pn_Liv['pass_maker'] = events_pn_Liv['player']
events_pn_Liv['pass_receiver'] = events_pn_Liv['player'].shift(-1)
events_pn_Real.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|
8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane | Raphaël Varane | Luka Modrić |
9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić | Luka Modrić | Daniel Carvajal Ramos |
10 | 0 | 11 | Real Madrid | Pass | [22.3, 76.6] | [33.4, 68.0] | NaN | Daniel Carvajal Ramos | Daniel Carvajal Ramos | Carlos Henrique Casimiro |
11 | 0 | 15 | Real Madrid | Pass | [36.2, 75.3] | [43.6, 62.0] | Incomplete | Carlos Henrique Casimiro | Carlos Henrique Casimiro | Sergio Ramos García |
16 | 0 | 25 | Real Madrid | Pass | [14.7, 23.2] | [56.7, 6.2] | Incomplete | Sergio Ramos García | Sergio Ramos García | Marcelo Vieira da Silva Júnior |
17 | 0 | 40 | Real Madrid | Pass | [57.5, 4.6] | [49.2, 15.6] | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Carlos Henrique Casimiro |
18 | 0 | 43 | Real Madrid | Pass | [48.8, 18.4] | [49.8, 12.5] | NaN | Carlos Henrique Casimiro | Carlos Henrique Casimiro | Toni Kroos |
19 | 0 | 46 | Real Madrid | Pass | [48.8, 13.9] | [36.1, 56.3] | NaN | Toni Kroos | Toni Kroos | Raphaël Varane |
20 | 0 | 52 | Real Madrid | Pass | [41.3, 54.8] | [34.4, 40.2] | NaN | Raphaël Varane | Raphaël Varane | Sergio Ramos García |
21 | 0 | 55 | Real Madrid | Pass | [39.1, 36.5] | [65.4, 13.1] | NaN | Sergio Ramos García | Sergio Ramos García | Cristiano Ronaldo dos Santos Aveiro |
events_pn_Liv.head(10)
| | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|
6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner | James Philip Milner | Dejan Lovren |
7 | 0 | 3 | Liverpool | Pass | [35.0, 40.8] | [92.7, 22.7] | Incomplete | Dejan Lovren | Dejan Lovren | Jordan Brian Henderson |
12 | 0 | 16 | Liverpool | Pass | [76.5, 18.1] | [84.8, 9.5] | NaN | Jordan Brian Henderson | Jordan Brian Henderson | Sadio Mané |
13 | 0 | 18 | Liverpool | Pass | [84.4, 10.0] | [92.5, 19.1] | NaN | Sadio Mané | Sadio Mané | Roberto Firmino Barbosa de Oliveira |
14 | 0 | 19 | Liverpool | Pass | [91.6, 21.3] | [90.6, 50.7] | NaN | Roberto Firmino Barbosa de Oliveira | Roberto Firmino Barbosa de Oliveira | Mohamed Salah |
15 | 0 | 22 | Liverpool | Pass | [92.2, 50.9] | [109.7, 46.4] | Incomplete | Mohamed Salah | Mohamed Salah | Trent Alexander-Arnold |
25 | 1 | 7 | Liverpool | Pass | [42.0, 75.9] | [115.6, 59.3] | Incomplete | Trent Alexander-Arnold | Trent Alexander-Arnold | Virgil van Dijk |
37 | 2 | 0 | Liverpool | Pass | [9.9, 39.1] | [28.1, 4.2] | NaN | Virgil van Dijk | Virgil van Dijk | Andrew Robertson |
38 | 2 | 3 | Liverpool | Pass | [43.2, 2.8] | [50.1, 4.8] | Incomplete | Andrew Robertson | Andrew Robertson | Andrew Robertson |
39 | 2 | 7 | Liverpool | Pass | [53.2, 0.1] | [50.0, 4.0] | NaN | Andrew Robertson | Andrew Robertson | James Philip Milner |
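To see what shift(-1) does in isolation, here is a minimal sketch on a toy frame (events_demo is hypothetical; the three passes are the first three Real Madrid passes from the tables above). Taking a .copy() of the filtered frame before adding columns is an addition of this sketch, meant to avoid the SettingWithCopyWarning that some pandas versions raise when assigning to a filtered slice:

```python
import pandas as pd

# Toy events: one non-pass row followed by three consecutive passes.
events_demo = pd.DataFrame(
    {
        "type": ["Half Start", "Pass", "Pass", "Pass"],
        "player": [None, "Raphaël Varane", "Luka Modrić", "Daniel Carvajal Ramos"],
    },
    index=[5, 8, 9, 10],
)

# Filter the passes and take an explicit copy before adding columns.
passes_demo = events_demo[events_demo["type"] == "Pass"].copy()

passes_demo["pass_maker"] = passes_demo["player"]
# shift(-1) moves every value up one row; the last row has no successor.
passes_demo["pass_receiver"] = passes_demo["player"].shift(-1)
print(passes_demo[["pass_maker", "pass_receiver"]])
```

The last pass gets NaN as its receiver because there is no following row. Note that pairing a pass with the maker of the next pass is a heuristic; the column list printed earlier also contains a pass_recipient column, which may be worth comparing against it.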
The rows where pass_outcome is set to nan are actually the successful passes. We filter the datasets again to keep only the successful passes:
events_pn_Real = events_pn_Real[events_pn_Real['pass_outcome'].isnull()].reset_index()
events_pn_Liv = events_pn_Liv[events_pn_Liv['pass_outcome'].isnull()].reset_index()
events_pn_Real.head(10)
| | index | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane | Raphaël Varane | Luka Modrić |
1 | 9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić | Luka Modrić | Daniel Carvajal Ramos |
2 | 10 | 0 | 11 | Real Madrid | Pass | [22.3, 76.6] | [33.4, 68.0] | NaN | Daniel Carvajal Ramos | Daniel Carvajal Ramos | Carlos Henrique Casimiro |
3 | 17 | 0 | 40 | Real Madrid | Pass | [57.5, 4.6] | [49.2, 15.6] | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Carlos Henrique Casimiro |
4 | 18 | 0 | 43 | Real Madrid | Pass | [48.8, 18.4] | [49.8, 12.5] | NaN | Carlos Henrique Casimiro | Carlos Henrique Casimiro | Toni Kroos |
5 | 19 | 0 | 46 | Real Madrid | Pass | [48.8, 13.9] | [36.1, 56.3] | NaN | Toni Kroos | Toni Kroos | Raphaël Varane |
6 | 20 | 0 | 52 | Real Madrid | Pass | [41.3, 54.8] | [34.4, 40.2] | NaN | Raphaël Varane | Raphaël Varane | Sergio Ramos García |
7 | 21 | 0 | 55 | Real Madrid | Pass | [39.1, 36.5] | [65.4, 13.1] | NaN | Sergio Ramos García | Sergio Ramos García | Cristiano Ronaldo dos Santos Aveiro |
8 | 22 | 0 | 58 | Real Madrid | Pass | [64.5, 11.1] | [54.2, 5.6] | NaN | Cristiano Ronaldo dos Santos Aveiro | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior |
9 | 23 | 0 | 59 | Real Madrid | Pass | [55.3, 5.5] | [83.9, 4.3] | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Karim Benzema |
events_pn_Liv.head(10)
| | index | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner | James Philip Milner | Dejan Lovren |
1 | 12 | 0 | 16 | Liverpool | Pass | [76.5, 18.1] | [84.8, 9.5] | NaN | Jordan Brian Henderson | Jordan Brian Henderson | Sadio Mané |
2 | 13 | 0 | 18 | Liverpool | Pass | [84.4, 10.0] | [92.5, 19.1] | NaN | Sadio Mané | Sadio Mané | Roberto Firmino Barbosa de Oliveira |
3 | 14 | 0 | 19 | Liverpool | Pass | [91.6, 21.3] | [90.6, 50.7] | NaN | Roberto Firmino Barbosa de Oliveira | Roberto Firmino Barbosa de Oliveira | Mohamed Salah |
4 | 37 | 2 | 0 | Liverpool | Pass | [9.9, 39.1] | [28.1, 4.2] | NaN | Virgil van Dijk | Virgil van Dijk | Andrew Robertson |
5 | 39 | 2 | 7 | Liverpool | Pass | [53.2, 0.1] | [50.0, 4.0] | NaN | Andrew Robertson | Andrew Robertson | James Philip Milner |
6 | 40 | 2 | 10 | Liverpool | Pass | [45.5, 4.0] | [27.4, 16.8] | NaN | James Philip Milner | James Philip Milner | Virgil van Dijk |
7 | 41 | 2 | 13 | Liverpool | Pass | [26.7, 19.6] | [27.8, 47.3] | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren |
8 | 42 | 2 | 16 | Liverpool | Pass | [28.0, 45.4] | [28.4, 21.4] | NaN | Dejan Lovren | Dejan Lovren | Virgil van Dijk |
9 | 43 | 2 | 19 | Liverpool | Pass | [30.4, 25.7] | [30.7, 52.9] | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren |
So we have cleaned and reshaped the datasets as intended. From here on we focus only on the pass network among the players in the starting 11 of each team, so we will discard the rows for pass events that took place after a team's first substitution. Let us find the `minute` and `second` of the first substitution for both `Real Madrid` and `Liverpool`.
Now, let us filter the datasets `events_Real` and `events_Liv` for the rows whose `type` is `Substitution`. This tells us when the first substitution took place for each team.
substitution_Real = events_Real[events_Real['type'] == 'Substitution']
substitution_Liv = events_Liv[events_Liv['type'] == 'Substitution']
substitution_Real
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
3485 | 36 | 17 | Real Madrid | Substitution | NaN | NaN | NaN | Daniel Carvajal Ramos |
3486 | 60 | 56 | Real Madrid | Substitution | NaN | NaN | NaN | Francisco Román Alarcón Suárez |
3488 | 88 | 21 | Real Madrid | Substitution | NaN | NaN | NaN | Karim Benzema |
substitution_Liv
| | minute | second | team | type | location | pass_end_location | pass_outcome | player |
---|---|---|---|---|---|---|---|---|
3484 | 29 | 39 | Liverpool | Substitution | NaN | NaN | NaN | Mohamed Salah |
3487 | 82 | 27 | Liverpool | Substitution | NaN | NaN | NaN | James Philip Milner |
The first substitution for `Real Madrid` takes place at minute `36`, second `17`, whereas for `Liverpool` it takes place at minute `29`, second `39`. Let us extract these values with a few lines of Python:
substitution_Real_minute = np.min(substitution_Real['minute'])
substitution_Real_minute_data = substitution_Real[substitution_Real['minute'] == substitution_Real_minute]
substitution_Real_second = np.min(substitution_Real_minute_data['second'])
print("minute =", substitution_Real_minute, "second =", substitution_Real_second)
minute = 36 second = 17
substitution_Liv_minute = np.min(substitution_Liv['minute'])
substitution_Liv_minute_data = substitution_Liv[substitution_Liv['minute'] == substitution_Liv_minute]
substitution_Liv_second = np.min(substitution_Liv_minute_data['second'])
print("minute =", substitution_Liv_minute, "second =", substitution_Liv_second)
minute = 29 second = 39
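As an alternative to the two-step minimum above, the earliest substitution can be read off by sorting on both clock columns at once. The sketch below uses a small made-up table (the player labels `P1` to `P3` are placeholders, not match data) and assumes the columns are named `minute` and `second` as in the event data.

```python
import pandas as pd

# Toy substitution table (hypothetical values for illustration only)
subs = pd.DataFrame({
    "minute": [60, 36, 88],
    "second": [56, 17, 21],
    "player": ["P2", "P1", "P3"],
})

# Sort by minute first, then by second, and take the earliest row
first_sub = subs.sort_values(["minute", "second"]).iloc[0]

print("minute =", first_sub["minute"], "second =", first_sub["second"])
```

Sorting on both columns also handles the case where two substitutions fall in the same minute.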
events_pn_Real = events_pn_Real[(events_pn_Real['minute'] <= substitution_Real_minute)]
events_pn_Liv = events_pn_Liv[(events_pn_Liv['minute'] <= substitution_Liv_minute)]
events_pn_Real.head(10)
| | index | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 8 | 0 | 8 | Real Madrid | Pass | [27.4, 60.2] | [36.1, 71.6] | NaN | Raphaël Varane | Raphaël Varane | Luka Modrić |
1 | 9 | 0 | 10 | Real Madrid | Pass | [35.3, 75.4] | [22.4, 76.6] | NaN | Luka Modrić | Luka Modrić | Daniel Carvajal Ramos |
2 | 10 | 0 | 11 | Real Madrid | Pass | [22.3, 76.6] | [33.4, 68.0] | NaN | Daniel Carvajal Ramos | Daniel Carvajal Ramos | Carlos Henrique Casimiro |
3 | 17 | 0 | 40 | Real Madrid | Pass | [57.5, 4.6] | [49.2, 15.6] | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Carlos Henrique Casimiro |
4 | 18 | 0 | 43 | Real Madrid | Pass | [48.8, 18.4] | [49.8, 12.5] | NaN | Carlos Henrique Casimiro | Carlos Henrique Casimiro | Toni Kroos |
5 | 19 | 0 | 46 | Real Madrid | Pass | [48.8, 13.9] | [36.1, 56.3] | NaN | Toni Kroos | Toni Kroos | Raphaël Varane |
6 | 20 | 0 | 52 | Real Madrid | Pass | [41.3, 54.8] | [34.4, 40.2] | NaN | Raphaël Varane | Raphaël Varane | Sergio Ramos García |
7 | 21 | 0 | 55 | Real Madrid | Pass | [39.1, 36.5] | [65.4, 13.1] | NaN | Sergio Ramos García | Sergio Ramos García | Cristiano Ronaldo dos Santos Aveiro |
8 | 22 | 0 | 58 | Real Madrid | Pass | [64.5, 11.1] | [54.2, 5.6] | NaN | Cristiano Ronaldo dos Santos Aveiro | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior |
9 | 23 | 0 | 59 | Real Madrid | Pass | [55.3, 5.5] | [83.9, 4.3] | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Karim Benzema |
events_pn_Liv.head(10)
| | index | minute | second | team | type | location | pass_end_location | pass_outcome | player | pass_maker | pass_receiver |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 6 | 0 | 0 | Liverpool | Pass | [60.0, 40.0] | [32.1, 41.2] | NaN | James Philip Milner | James Philip Milner | Dejan Lovren |
1 | 12 | 0 | 16 | Liverpool | Pass | [76.5, 18.1] | [84.8, 9.5] | NaN | Jordan Brian Henderson | Jordan Brian Henderson | Sadio Mané |
2 | 13 | 0 | 18 | Liverpool | Pass | [84.4, 10.0] | [92.5, 19.1] | NaN | Sadio Mané | Sadio Mané | Roberto Firmino Barbosa de Oliveira |
3 | 14 | 0 | 19 | Liverpool | Pass | [91.6, 21.3] | [90.6, 50.7] | NaN | Roberto Firmino Barbosa de Oliveira | Roberto Firmino Barbosa de Oliveira | Mohamed Salah |
4 | 37 | 2 | 0 | Liverpool | Pass | [9.9, 39.1] | [28.1, 4.2] | NaN | Virgil van Dijk | Virgil van Dijk | Andrew Robertson |
5 | 39 | 2 | 7 | Liverpool | Pass | [53.2, 0.1] | [50.0, 4.0] | NaN | Andrew Robertson | Andrew Robertson | James Philip Milner |
6 | 40 | 2 | 10 | Liverpool | Pass | [45.5, 4.0] | [27.4, 16.8] | NaN | James Philip Milner | James Philip Milner | Virgil van Dijk |
7 | 41 | 2 | 13 | Liverpool | Pass | [26.7, 19.6] | [27.8, 47.3] | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren |
8 | 42 | 2 | 16 | Liverpool | Pass | [28.0, 45.4] | [28.4, 21.4] | NaN | Dejan Lovren | Dejan Lovren | Virgil van Dijk |
9 | 43 | 2 | 19 | Liverpool | Pass | [30.4, 25.7] | [30.7, 52.9] | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren |
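Note that the filter above compares minutes only, so passes played later within the substitution minute are still kept. If a stricter cut is wanted, one possible sketch is to convert both clocks to elapsed seconds and compare those. The toy values below are made up for illustration, and `elapsed`, `cutoff` and `before_sub` are hypothetical names that do not appear elsewhere in this notebook.

```python
import pandas as pd

# Toy pass events (hypothetical values, not taken from the match data)
passes = pd.DataFrame({
    "minute": [35, 36, 36, 37],
    "second": [50, 10, 40, 5],
    "pass_maker": ["A", "B", "C", "D"],
})

sub_minute, sub_second = 36, 17  # first substitution at 36:17

# Express both the pass times and the substitution time in seconds
elapsed = passes["minute"] * 60 + passes["second"]
cutoff = sub_minute * 60 + sub_second

# Keep only the passes played strictly before the substitution
before_sub = passes[elapsed < cutoff]
print(before_sub)
```

With the minute-only filter the pass at 36:40 would also survive; here it is dropped.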
Next, we split the `location` and `pass_end_location` columns into two columns each, one per coordinate, and name them `pass_maker_x`, `pass_maker_y`, `pass_receiver_x` and `pass_receiver_y`.
Let us manipulate the dataset for `Real Madrid` first:
Loc = events_pn_Real['location']
Loc = pd.DataFrame(Loc.to_list(), columns=['pass_maker_x', 'pass_maker_y'])
Loc_end = events_pn_Real['pass_end_location']
Loc_end = pd.DataFrame(Loc_end.to_list(), columns=['pass_receiver_x', 'pass_receiver_y'])
events_pn_Real['pass_maker_x'] = Loc['pass_maker_x']
events_pn_Real['pass_maker_y'] = Loc['pass_maker_y']
events_pn_Real['pass_receiver_x'] = Loc_end['pass_receiver_x']
events_pn_Real['pass_receiver_y'] = Loc_end['pass_receiver_y']
events_pn_Real = events_pn_Real[['index', 'minute', 'second', 'team', 'type', 'pass_outcome',
'player', 'pass_maker', 'pass_receiver', 'pass_maker_x',
'pass_maker_y', 'pass_receiver_x', 'pass_receiver_y']]
events_pn_Real.head(10)
| | index | minute | second | team | type | pass_outcome | player | pass_maker | pass_receiver | pass_maker_x | pass_maker_y | pass_receiver_x | pass_receiver_y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 8 | 0 | 8 | Real Madrid | Pass | NaN | Raphaël Varane | Raphaël Varane | Luka Modrić | 27.4 | 60.2 | 36.1 | 71.6 |
1 | 9 | 0 | 10 | Real Madrid | Pass | NaN | Luka Modrić | Luka Modrić | Daniel Carvajal Ramos | 35.3 | 75.4 | 22.4 | 76.6 |
2 | 10 | 0 | 11 | Real Madrid | Pass | NaN | Daniel Carvajal Ramos | Daniel Carvajal Ramos | Carlos Henrique Casimiro | 22.3 | 76.6 | 33.4 | 68.0 |
3 | 17 | 0 | 40 | Real Madrid | Pass | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Carlos Henrique Casimiro | 57.5 | 4.6 | 49.2 | 15.6 |
4 | 18 | 0 | 43 | Real Madrid | Pass | NaN | Carlos Henrique Casimiro | Carlos Henrique Casimiro | Toni Kroos | 48.8 | 18.4 | 49.8 | 12.5 |
5 | 19 | 0 | 46 | Real Madrid | Pass | NaN | Toni Kroos | Toni Kroos | Raphaël Varane | 48.8 | 13.9 | 36.1 | 56.3 |
6 | 20 | 0 | 52 | Real Madrid | Pass | NaN | Raphaël Varane | Raphaël Varane | Sergio Ramos García | 41.3 | 54.8 | 34.4 | 40.2 |
7 | 21 | 0 | 55 | Real Madrid | Pass | NaN | Sergio Ramos García | Sergio Ramos García | Cristiano Ronaldo dos Santos Aveiro | 39.1 | 36.5 | 65.4 | 13.1 |
8 | 22 | 0 | 58 | Real Madrid | Pass | NaN | Cristiano Ronaldo dos Santos Aveiro | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior | 64.5 | 11.1 | 54.2 | 5.6 |
9 | 23 | 0 | 59 | Real Madrid | Pass | NaN | Marcelo Vieira da Silva Júnior | Marcelo Vieira da Silva Júnior | Karim Benzema | 55.3 | 5.5 | 83.9 | 4.3 |
Loc = events_pn_Liv['location']
Loc = pd.DataFrame(Loc.to_list(), columns=['pass_maker_x', 'pass_maker_y'])
Loc_end = events_pn_Liv['pass_end_location']
Loc_end = pd.DataFrame(Loc_end.to_list(), columns=['pass_receiver_x', 'pass_receiver_y'])
events_pn_Liv['pass_maker_x'] = Loc['pass_maker_x']
events_pn_Liv['pass_maker_y'] = Loc['pass_maker_y']
events_pn_Liv['pass_receiver_x'] = Loc_end['pass_receiver_x']
events_pn_Liv['pass_receiver_y'] = Loc_end['pass_receiver_y']
events_pn_Liv = events_pn_Liv[['index', 'minute', 'second', 'team', 'type', 'pass_outcome',
'player', 'pass_maker', 'pass_receiver', 'pass_maker_x',
'pass_maker_y', 'pass_receiver_x', 'pass_receiver_y']]
events_pn_Liv.head(10)
| | index | minute | second | team | type | pass_outcome | player | pass_maker | pass_receiver | pass_maker_x | pass_maker_y | pass_receiver_x | pass_receiver_y |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 6 | 0 | 0 | Liverpool | Pass | NaN | James Philip Milner | James Philip Milner | Dejan Lovren | 60.0 | 40.0 | 32.1 | 41.2 |
1 | 12 | 0 | 16 | Liverpool | Pass | NaN | Jordan Brian Henderson | Jordan Brian Henderson | Sadio Mané | 76.5 | 18.1 | 84.8 | 9.5 |
2 | 13 | 0 | 18 | Liverpool | Pass | NaN | Sadio Mané | Sadio Mané | Roberto Firmino Barbosa de Oliveira | 84.4 | 10.0 | 92.5 | 19.1 |
3 | 14 | 0 | 19 | Liverpool | Pass | NaN | Roberto Firmino Barbosa de Oliveira | Roberto Firmino Barbosa de Oliveira | Mohamed Salah | 91.6 | 21.3 | 90.6 | 50.7 |
4 | 37 | 2 | 0 | Liverpool | Pass | NaN | Virgil van Dijk | Virgil van Dijk | Andrew Robertson | 9.9 | 39.1 | 28.1 | 4.2 |
5 | 39 | 2 | 7 | Liverpool | Pass | NaN | Andrew Robertson | Andrew Robertson | James Philip Milner | 53.2 | 0.1 | 50.0 | 4.0 |
6 | 40 | 2 | 10 | Liverpool | Pass | NaN | James Philip Milner | James Philip Milner | Virgil van Dijk | 45.5 | 4.0 | 27.4 | 16.8 |
7 | 41 | 2 | 13 | Liverpool | Pass | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren | 26.7 | 19.6 | 27.8 | 47.3 |
8 | 42 | 2 | 16 | Liverpool | Pass | NaN | Dejan Lovren | Dejan Lovren | Virgil van Dijk | 28.0 | 45.4 | 28.4 | 21.4 |
9 | 43 | 2 | 19 | Liverpool | Pass | NaN | Virgil van Dijk | Virgil van Dijk | Dejan Lovren | 30.4 | 25.7 | 30.7 | 52.9 |
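The column assignments above align rows by index label, so they rely on the dataframes carrying a clean `0..n-1` index. A hedged variant that works for any index is to build the coordinate frames on the original index and concatenate them. The sketch below uses two made-up rows; the name `events` and the index labels are hypothetical.

```python
import pandas as pd

# Toy events with a deliberately non-contiguous index
events = pd.DataFrame(
    {
        "pass_maker": ["A", "B"],
        "location": [[27.4, 60.2], [35.3, 75.4]],
        "pass_end_location": [[36.1, 71.6], [22.4, 76.6]],
    },
    index=[5, 9],
)

# Expand each [x, y] list into two columns, reusing the original index
start = pd.DataFrame(events["location"].tolist(),
                     columns=["pass_maker_x", "pass_maker_y"],
                     index=events.index)
end = pd.DataFrame(events["pass_end_location"].tolist(),
                   columns=["pass_receiver_x", "pass_receiver_y"],
                   index=events.index)

# Drop the list-valued columns and attach the coordinate columns
events = pd.concat(
    [events.drop(columns=["location", "pass_end_location"]), start, end],
    axis=1,
)
print(events)
```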
av_loc_Real = events_pn_Real.groupby('pass_maker').agg({'pass_maker_x':['mean'],
'pass_maker_y':['mean', 'count']})
av_loc_Real
pass_maker | pass_maker_x (mean) | pass_maker_y (mean) | pass_maker_y (count) |
---|---|---|---|
Carlos Henrique Casimiro | 60.845455 | 31.836364 | 11 |
Cristiano Ronaldo dos Santos Aveiro | 81.580000 | 29.160000 | 10 |
Daniel Carvajal Ramos | 64.341667 | 73.875000 | 24 |
Francisco Román Alarcón Suárez | 62.323529 | 27.082353 | 17 |
Karim Benzema | 65.081818 | 27.936364 | 11 |
Keylor Navas Gamboa | 10.870000 | 41.810000 | 10 |
Luka Modrić | 60.604762 | 55.028571 | 21 |
Marcelo Vieira da Silva Júnior | 59.865217 | 11.130435 | 23 |
Raphaël Varane | 37.436364 | 58.354545 | 22 |
Sergio Ramos García | 41.282353 | 24.514706 | 34 |
Toni Kroos | 51.190000 | 24.275000 | 40 |
The `groupby()` function from `pandas` splits `events_pn_Real` into groups indexed by player name, while the `agg()` function computes each pass maker's average location and counts the passes made by that player. Now let us flatten the column names of `av_loc_Real`:
av_loc_Real.columns = ['pass_maker_x', 'pass_maker_y', 'count']
av_loc_Real
pass_maker | pass_maker_x | pass_maker_y | count |
---|---|---|---|
Carlos Henrique Casimiro | 60.845455 | 31.836364 | 11 |
Cristiano Ronaldo dos Santos Aveiro | 81.580000 | 29.160000 | 10 |
Daniel Carvajal Ramos | 64.341667 | 73.875000 | 24 |
Francisco Román Alarcón Suárez | 62.323529 | 27.082353 | 17 |
Karim Benzema | 65.081818 | 27.936364 | 11 |
Keylor Navas Gamboa | 10.870000 | 41.810000 | 10 |
Luka Modrić | 60.604762 | 55.028571 | 21 |
Marcelo Vieira da Silva Júnior | 59.865217 | 11.130435 | 23 |
Raphaël Varane | 37.436364 | 58.354545 | 22 |
Sergio Ramos García | 41.282353 | 24.514706 | 34 |
Toni Kroos | 51.190000 | 24.275000 | 40 |
We do the same for `Liverpool`:
av_loc_Liv = events_pn_Liv.groupby('pass_maker').agg({'pass_maker_x':['mean'],
'pass_maker_y':['mean', 'count']})
av_loc_Liv.columns = ['pass_maker_x', 'pass_maker_y', 'count']
av_loc_Liv
pass_maker | pass_maker_x | pass_maker_y | count |
---|---|---|---|
Andrew Robertson | 59.815385 | 6.830769 | 13 |
Dejan Lovren | 41.690909 | 60.172727 | 11 |
Georginio Wijnaldum | 76.390909 | 28.518182 | 11 |
James Philip Milner | 72.353333 | 36.153333 | 15 |
Jordan Brian Henderson | 61.035294 | 37.152941 | 17 |
Loris Karius | 12.914286 | 40.385714 | 7 |
Mohamed Salah | 77.550000 | 64.710000 | 10 |
Roberto Firmino Barbosa de Oliveira | 78.250000 | 43.570000 | 10 |
Sadio Mané | 86.275000 | 22.075000 | 4 |
Trent Alexander-Arnold | 64.666667 | 72.550000 | 12 |
Virgil van Dijk | 43.366667 | 25.433333 | 9 |
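The aggregation and the subsequent column renaming can also be done in a single step with pandas named aggregation. The following is a minimal sketch on made-up data (players `A` and `B` are placeholders), not a replacement for the cells above.

```python
import pandas as pd

# Toy pass events (hypothetical values)
passes = pd.DataFrame({
    "pass_maker": ["A", "A", "B"],
    "pass_maker_x": [10.0, 20.0, 50.0],
    "pass_maker_y": [30.0, 50.0, 40.0],
})

# Each keyword names an output column: (input column, aggregation)
av_loc = passes.groupby("pass_maker").agg(
    pass_maker_x=("pass_maker_x", "mean"),
    pass_maker_y=("pass_maker_y", "mean"),
    count=("pass_maker_y", "count"),
)
print(av_loc)
```

This yields flat column names directly, so no separate assignment to `.columns` is needed.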
Next, we count the passes between each ordered pair of players (a pass from a player `A` to a player `B` is not identical to a pass from player `B` to player `A`). We will use the `groupby()` and `count()` functions to count the number of rows in which a given player `A` passed the ball to a given player `B`.
pass_Real = events_pn_Real.groupby(['pass_maker', 'pass_receiver']).index.count().reset_index()
pass_Real.head(10)
| | pass_maker | pass_receiver | index |
---|---|---|---|
0 | Carlos Henrique Casimiro | Daniel Carvajal Ramos | 1 |
1 | Carlos Henrique Casimiro | Luka Modrić | 1 |
2 | Carlos Henrique Casimiro | Marcelo Vieira da Silva Júnior | 1 |
3 | Carlos Henrique Casimiro | Raphaël Varane | 1 |
4 | Carlos Henrique Casimiro | Sergio Ramos García | 1 |
5 | Carlos Henrique Casimiro | Toni Kroos | 6 |
6 | Cristiano Ronaldo dos Santos Aveiro | Daniel Carvajal Ramos | 3 |
7 | Cristiano Ronaldo dos Santos Aveiro | Karim Benzema | 1 |
8 | Cristiano Ronaldo dos Santos Aveiro | Luka Modrić | 1 |
9 | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior | 4 |
pass_Liv = events_pn_Liv.groupby(['pass_maker', 'pass_receiver']).index.count().reset_index()
pass_Liv.head(10)
| | pass_maker | pass_receiver | index |
---|---|---|---|
0 | Andrew Robertson | Andrew Robertson | 1 |
1 | Andrew Robertson | Georginio Wijnaldum | 3 |
2 | Andrew Robertson | James Philip Milner | 3 |
3 | Andrew Robertson | Jordan Brian Henderson | 2 |
4 | Andrew Robertson | Roberto Firmino Barbosa de Oliveira | 2 |
5 | Andrew Robertson | Virgil van Dijk | 2 |
6 | Dejan Lovren | James Philip Milner | 1 |
7 | Dejan Lovren | Jordan Brian Henderson | 1 |
8 | Dejan Lovren | Loris Karius | 2 |
9 | Dejan Lovren | Mohamed Salah | 1 |
Let us rename the `index` column to `number_of_passes`:
pass_Real.rename(columns = {'index':'number_of_passes'}, inplace = True)
pass_Real.head(10)
| | pass_maker | pass_receiver | number_of_passes |
---|---|---|---|
0 | Carlos Henrique Casimiro | Daniel Carvajal Ramos | 1 |
1 | Carlos Henrique Casimiro | Luka Modrić | 1 |
2 | Carlos Henrique Casimiro | Marcelo Vieira da Silva Júnior | 1 |
3 | Carlos Henrique Casimiro | Raphaël Varane | 1 |
4 | Carlos Henrique Casimiro | Sergio Ramos García | 1 |
5 | Carlos Henrique Casimiro | Toni Kroos | 6 |
6 | Cristiano Ronaldo dos Santos Aveiro | Daniel Carvajal Ramos | 3 |
7 | Cristiano Ronaldo dos Santos Aveiro | Karim Benzema | 1 |
8 | Cristiano Ronaldo dos Santos Aveiro | Luka Modrić | 1 |
9 | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior | 4 |
pass_Liv.rename(columns = {'index':'number_of_passes'}, inplace = True)
pass_Liv.head(10)
| | pass_maker | pass_receiver | number_of_passes |
---|---|---|---|
0 | Andrew Robertson | Andrew Robertson | 1 |
1 | Andrew Robertson | Georginio Wijnaldum | 3 |
2 | Andrew Robertson | James Philip Milner | 3 |
3 | Andrew Robertson | Jordan Brian Henderson | 2 |
4 | Andrew Robertson | Roberto Firmino Barbosa de Oliveira | 2 |
5 | Andrew Robertson | Virgil van Dijk | 2 |
6 | Dejan Lovren | James Philip Milner | 1 |
7 | Dejan Lovren | Jordan Brian Henderson | 1 |
8 | Dejan Lovren | Loris Karius | 2 |
9 | Dejan Lovren | Mohamed Salah | 1 |
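Counting and renaming can be combined: `size()` counts the rows in each group, and `reset_index(name=...)` names the resulting column. The sketch below runs on made-up passes between placeholder players `A`, `B` and `C`.

```python
import pandas as pd

# Toy pass events (hypothetical values)
passes = pd.DataFrame({
    "pass_maker":    ["A", "A", "B", "A"],
    "pass_receiver": ["B", "B", "A", "C"],
})

# One row per ordered (maker, receiver) pair, with the number of passes
pair_counts = (passes.groupby(["pass_maker", "pass_receiver"])
                     .size()
                     .reset_index(name="number_of_passes"))
print(pair_counts)
```

Note that `A` to `B` and `B` to `A` end up as separate rows, which is what a directed network needs.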
Now we merge the dataframes `av_loc_Real` and `pass_Real`. Let us first identify the left and the right dataframes for the merge: since we call `merge()` on `pass_Real`, it is the left dataframe, and `av_loc_Real`, passed as the argument, is the right. We will use the `merge()` function from `pandas` to carry out the merging operation.
pass_Real = pass_Real.merge(av_loc_Real, left_on = 'pass_maker', right_index = True)
pass_Real.head(10)
| | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count |
---|---|---|---|---|---|---|
0 | Carlos Henrique Casimiro | Daniel Carvajal Ramos | 1 | 60.845455 | 31.836364 | 11 |
1 | Carlos Henrique Casimiro | Luka Modrić | 1 | 60.845455 | 31.836364 | 11 |
2 | Carlos Henrique Casimiro | Marcelo Vieira da Silva Júnior | 1 | 60.845455 | 31.836364 | 11 |
3 | Carlos Henrique Casimiro | Raphaël Varane | 1 | 60.845455 | 31.836364 | 11 |
4 | Carlos Henrique Casimiro | Sergio Ramos García | 1 | 60.845455 | 31.836364 | 11 |
5 | Carlos Henrique Casimiro | Toni Kroos | 6 | 60.845455 | 31.836364 | 11 |
6 | Cristiano Ronaldo dos Santos Aveiro | Daniel Carvajal Ramos | 3 | 81.580000 | 29.160000 | 10 |
7 | Cristiano Ronaldo dos Santos Aveiro | Karim Benzema | 1 | 81.580000 | 29.160000 | 10 |
8 | Cristiano Ronaldo dos Santos Aveiro | Luka Modrić | 1 | 81.580000 | 29.160000 | 10 |
9 | Cristiano Ronaldo dos Santos Aveiro | Marcelo Vieira da Silva Júnior | 4 | 81.580000 | 29.160000 | 10 |
The `left_on` argument specifies the column of the left dataframe to use as the join key, and the `right_index` argument decides whether to use the index of the right dataframe as its join key. Let us do the same operation for the other team:
pass_Liv = pass_Liv.merge(av_loc_Liv, left_on = 'pass_maker', right_index = True)
pass_Liv.head(10)
| | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count |
---|---|---|---|---|---|---|
0 | Andrew Robertson | Andrew Robertson | 1 | 59.815385 | 6.830769 | 13 |
1 | Andrew Robertson | Georginio Wijnaldum | 3 | 59.815385 | 6.830769 | 13 |
2 | Andrew Robertson | James Philip Milner | 3 | 59.815385 | 6.830769 | 13 |
3 | Andrew Robertson | Jordan Brian Henderson | 2 | 59.815385 | 6.830769 | 13 |
4 | Andrew Robertson | Roberto Firmino Barbosa de Oliveira | 2 | 59.815385 | 6.830769 | 13 |
5 | Andrew Robertson | Virgil van Dijk | 2 | 59.815385 | 6.830769 | 13 |
6 | Dejan Lovren | James Philip Milner | 1 | 41.690909 | 60.172727 | 11 |
7 | Dejan Lovren | Jordan Brian Henderson | 1 | 41.690909 | 60.172727 | 11 |
8 | Dejan Lovren | Loris Karius | 2 | 41.690909 | 60.172727 | 11 |
9 | Dejan Lovren | Mohamed Salah | 1 | 41.690909 | 60.172727 | 11 |
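To see the roles of `left_on` and `right_index` in isolation, here is a small sketch on made-up frames. The names `pair_counts`, `av_loc` and `merged` and all values are hypothetical; only the `merge()` call mirrors the one used above.

```python
import pandas as pd

# Left frame: one row per (maker, receiver) pair
pair_counts = pd.DataFrame({
    "pass_maker": ["A", "A", "B"],
    "pass_receiver": ["B", "C", "A"],
    "number_of_passes": [2, 1, 1],
})

# Right frame: average locations, indexed by player name
av_loc = pd.DataFrame(
    {"pass_maker_x": [15.0, 50.0], "pass_maker_y": [40.0, 30.0], "count": [3, 1]},
    index=["A", "B"],
)

# Match the left column 'pass_maker' against the right frame's index
merged = pair_counts.merge(av_loc, left_on="pass_maker", right_index=True)
print(merged)
```

Every pair row picks up the average location of its pass maker, so the maker's coordinates repeat across all rows for that player.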
pass_Real = pass_Real.merge(av_loc_Real, left_on = 'pass_receiver',
right_index = True, suffixes = ['', '_receipt'])
pass_Real.rename(columns = {'pass_maker_x_receipt':'pass_receiver_x',
'pass_maker_y_receipt':'pass_receiver_y',
'count_receipt':'number_of_passes_received'}, inplace = True)
pass_Real = pass_Real[pass_Real['pass_maker'] != pass_Real['pass_receiver']].reset_index()
pass_Real
| | index | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count | pass_receiver_x | pass_receiver_y | number_of_passes_received |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | Carlos Henrique Casimiro | Daniel Carvajal Ramos | 1 | 60.845455 | 31.836364 | 11 | 64.341667 | 73.875 | 24 |
1 | 6 | Cristiano Ronaldo dos Santos Aveiro | Daniel Carvajal Ramos | 3 | 81.580000 | 29.160000 | 10 | 64.341667 | 73.875 | 24 |
2 | 21 | Francisco Román Alarcón Suárez | Daniel Carvajal Ramos | 2 | 62.323529 | 27.082353 | 17 | 64.341667 | 73.875 | 24 |
3 | 29 | Karim Benzema | Daniel Carvajal Ramos | 2 | 65.081818 | 27.936364 | 11 | 64.341667 | 73.875 | 24 |
4 | 39 | Luka Modrić | Daniel Carvajal Ramos | 10 | 60.604762 | 55.028571 | 21 | 64.341667 | 73.875 | 24 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
73 | 16 | Daniel Carvajal Ramos | Keylor Navas Gamboa | 1 | 64.341667 | 73.875000 | 24 | 10.870000 | 41.810 | 10 |
74 | 30 | Karim Benzema | Keylor Navas Gamboa | 1 | 65.081818 | 27.936364 | 11 | 10.870000 | 41.810 | 10 |
75 | 57 | Raphaël Varane | Keylor Navas Gamboa | 2 | 37.436364 | 58.354545 | 22 | 10.870000 | 41.810 | 10 |
76 | 64 | Sergio Ramos García | Keylor Navas Gamboa | 1 | 41.282353 | 24.514706 | 34 | 10.870000 | 41.810 | 10 |
77 | 74 | Toni Kroos | Keylor Navas Gamboa | 1 | 51.190000 | 24.275000 | 40 | 10.870000 | 41.810 | 10 |
78 rows × 10 columns
pass_Liv = pass_Liv.merge(av_loc_Liv, left_on = 'pass_receiver',
right_index = True, suffixes = ['', '_receipt'])
pass_Liv.rename(columns = {'pass_maker_x_receipt':'pass_receiver_x',
'pass_maker_y_receipt':'pass_receiver_y',
'count_receipt':'number_of_passes_received'}, inplace = True)
pass_Liv = pass_Liv[pass_Liv['pass_maker'] != pass_Liv['pass_receiver']].reset_index()
pass_Liv
| | index | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count | pass_receiver_x | pass_receiver_y | number_of_passes_received |
---|---|---|---|---|---|---|---|---|---|---|
0 | 12 | Georginio Wijnaldum | Andrew Robertson | 4 | 76.390909 | 28.518182 | 11 | 59.815385 | 6.830769 | 13 |
1 | 18 | James Philip Milner | Andrew Robertson | 1 | 72.353333 | 36.153333 | 15 | 59.815385 | 6.830769 | 13 |
2 | 28 | Jordan Brian Henderson | Andrew Robertson | 1 | 61.035294 | 37.152941 | 17 | 59.815385 | 6.830769 | 13 |
3 | 36 | Loris Karius | Andrew Robertson | 1 | 12.914286 | 40.385714 | 7 | 59.815385 | 6.830769 | 13 |
4 | 54 | Trent Alexander-Arnold | Andrew Robertson | 1 | 64.666667 | 72.550000 | 12 | 59.815385 | 6.830769 | 13 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 55 | Trent Alexander-Arnold | Dejan Lovren | 1 | 64.666667 | 72.550000 | 12 | 41.690909 | 60.172727 | 11 |
60 | 61 | Virgil van Dijk | Dejan Lovren | 3 | 43.366667 | 25.433333 | 9 | 41.690909 | 60.172727 | 11 |
61 | 25 | James Philip Milner | Sadio Mané | 2 | 72.353333 | 36.153333 | 15 | 86.275000 | 22.075000 | 4 |
62 | 33 | Jordan Brian Henderson | Sadio Mané | 1 | 61.035294 | 37.152941 | 17 | 86.275000 | 22.075000 | 4 |
63 | 43 | Mohamed Salah | Sadio Mané | 1 | 77.550000 | 64.710000 | 10 | 86.275000 | 22.075000 | 4 |
64 rows × 10 columns
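The second merge attaches the receiver's average location. Because the incoming columns clash with the ones already present, `suffixes` decides how the clashing names are disambiguated. The sketch below reproduces the idea on made-up data, including the removal of self-passes; all names and values are hypothetical.

```python
import pandas as pd

# Pair counts that already carry the maker's average location
pairs = pd.DataFrame({
    "pass_maker": ["A", "B", "A"],
    "pass_receiver": ["B", "A", "A"],   # the last row is a self-pass
    "number_of_passes": [2, 1, 1],
    "pass_maker_x": [15.0, 50.0, 15.0],
    "pass_maker_y": [40.0, 30.0, 40.0],
})
av_loc = pd.DataFrame(
    {"pass_maker_x": [15.0, 50.0], "pass_maker_y": [40.0, 30.0]},
    index=["A", "B"],
)

# Keep the left names unchanged and tag the clashing right names
pairs = pairs.merge(av_loc, left_on="pass_receiver", right_index=True,
                    suffixes=("", "_receipt"))
pairs = pairs.rename(columns={"pass_maker_x_receipt": "pass_receiver_x",
                              "pass_maker_y_receipt": "pass_receiver_y"})

# Drop rows where a player is recorded as passing to himself
pairs = pairs[pairs["pass_maker"] != pairs["pass_receiver"]].reset_index(drop=True)
print(pairs)
```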
pass_Real_new = pass_Real.replace({"pass_maker": players_Real, "pass_receiver": players_Real})
pass_Real_new
| | index | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count | pass_receiver_x | pass_receiver_y | number_of_passes_received |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 14 | 2 | 1 | 60.845455 | 31.836364 | 11 | 64.341667 | 73.875 | 24 |
1 | 6 | 7 | 2 | 3 | 81.580000 | 29.160000 | 10 | 64.341667 | 73.875 | 24 |
2 | 21 | 22 | 2 | 2 | 62.323529 | 27.082353 | 17 | 64.341667 | 73.875 | 24 |
3 | 29 | 9 | 2 | 2 | 65.081818 | 27.936364 | 11 | 64.341667 | 73.875 | 24 |
4 | 39 | 10 | 2 | 10 | 60.604762 | 55.028571 | 21 | 64.341667 | 73.875 | 24 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
73 | 16 | 2 | 1 | 1 | 64.341667 | 73.875000 | 24 | 10.870000 | 41.810 | 10 |
74 | 30 | 9 | 1 | 1 | 65.081818 | 27.936364 | 11 | 10.870000 | 41.810 | 10 |
75 | 57 | 5 | 1 | 2 | 37.436364 | 58.354545 | 22 | 10.870000 | 41.810 | 10 |
76 | 64 | 4 | 1 | 1 | 41.282353 | 24.514706 | 34 | 10.870000 | 41.810 | 10 |
77 | 74 | 8 | 1 | 1 | 51.190000 | 24.275000 | 40 | 10.870000 | 41.810 | 10 |
78 rows × 10 columns
pass_Liv_new = pass_Liv.replace({"pass_maker": players_Liv, "pass_receiver": players_Liv})
pass_Liv_new
| | index | pass_maker | pass_receiver | number_of_passes | pass_maker_x | pass_maker_y | count | pass_receiver_x | pass_receiver_y | number_of_passes_received |
---|---|---|---|---|---|---|---|---|---|---|
0 | 12 | 5 | 26 | 4 | 76.390909 | 28.518182 | 11 | 59.815385 | 6.830769 | 13 |
1 | 18 | 7 | 26 | 1 | 72.353333 | 36.153333 | 15 | 59.815385 | 6.830769 | 13 |
2 | 28 | 14 | 26 | 1 | 61.035294 | 37.152941 | 17 | 59.815385 | 6.830769 | 13 |
3 | 36 | 1 | 26 | 1 | 12.914286 | 40.385714 | 7 | 59.815385 | 6.830769 | 13 |
4 | 54 | 66 | 26 | 1 | 64.666667 | 72.550000 | 12 | 59.815385 | 6.830769 | 13 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | 55 | 66 | 6 | 1 | 64.666667 | 72.550000 | 12 | 41.690909 | 60.172727 | 11 |
60 | 61 | 4 | 6 | 3 | 43.366667 | 25.433333 | 9 | 41.690909 | 60.172727 | 11 |
61 | 25 | 7 | 19 | 2 | 72.353333 | 36.153333 | 15 | 86.275000 | 22.075000 | 4 |
62 | 33 | 14 | 19 | 1 | 61.035294 | 37.152941 | 17 | 86.275000 | 22.075000 | 4 |
63 | 43 | 11 | 19 | 1 | 77.550000 | 64.710000 | 10 | 86.275000 | 22.075000 | 4 |
64 rows × 10 columns
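The `replace()` calls above swap player names for jersey numbers using a name-to-number dictionary. A minimal sketch of that pattern follows, with a made-up dictionary `jersey` standing in for `players_Real` or `players_Liv` (whose exact contents are assumed here), plus a quick check that no name is left unmapped.

```python
import pandas as pd

# Hypothetical name -> jersey number mapping (numbers stored as strings)
jersey = {"Player One": "8", "Player Two": "4"}

pairs = pd.DataFrame({
    "pass_maker": ["Player One", "Player Two"],
    "pass_receiver": ["Player Two", "Player One"],
    "number_of_passes": [10, 9],
})

# A nested dict restricts each replacement to the named column
pairs_new = pairs.replace({"pass_maker": jersey, "pass_receiver": jersey})

# Names that appear in the data but have no jersey number
missing = (set(pairs["pass_maker"]) | set(pairs["pass_receiver"])) - set(jersey)
print(pairs_new)
print("unmapped names:", missing)
```

An empty `missing` set means every node label in the network will be a jersey number.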
pitch = Pitch(pitch_color='grass', goal_type = 'box', line_color='white', stripe = True,
constrained_layout=True, tight_layout=False)
fig, ax = pitch.draw()
arrows = pitch.arrows(pass_Real.pass_maker_x, pass_Real.pass_maker_y,
pass_Real.pass_receiver_x, pass_Real.pass_receiver_y, lw = 5,
color = 'black', zorder = 1, ax=ax)
nodes = pitch.scatter(av_loc_Real.pass_maker_x, av_loc_Real.pass_maker_y,
s=350, color = 'white', edgecolors='black', linewidth=1, alpha = 1, ax = ax)
for index, row in av_loc_Real.iterrows():
    pitch.annotate(players_Real[row.name], xy=(row.pass_maker_x, row.pass_maker_y),
                   c ='black', va = 'center', ha = 'center', size = 10, ax = ax)
plt.title("Pass network for Real Madrid against Liverpool", size = 20)
plt.show()
pitch = Pitch(pitch_color='grass', goal_type = 'box', stripe = True,
line_color='white', constrained_layout=True, tight_layout=False)
fig, ax = pitch.draw()
arrows = pitch.arrows(120 - pass_Liv.pass_maker_x, pass_Liv.pass_maker_y,
120 - pass_Liv.pass_receiver_x, pass_Liv.pass_receiver_y, lw = 5,
color = 'black', zorder = 1, ax = ax)
nodes = pitch.scatter(120 - av_loc_Liv.pass_maker_x, av_loc_Liv.pass_maker_y,
s=350, color = 'red', edgecolors = 'black', linewidth=1, alpha = 1, ax = ax)
for index, row in av_loc_Liv.iterrows():
    pitch.annotate(players_Liv[row.name], xy=(120 - row.pass_maker_x, row.pass_maker_y),
                   c ='black', va = 'center', ha = 'center', size = 10, ax = ax)
plt.title("Pass network for Liverpool against Real Madrid", size = 20)
plt.show()
For `Liverpool`'s pass network visualization, we subtract the x coordinates from 120 just to reverse the x-axis.
Now that we have visualized the pass networks of both teams involved in the game, we will analyze these networks using metrics from the complex network analysis literature.
Note that both of our networks are directed weighted graphs, with the number of passes as the weight of a directed edge.
Let us first build a graph isomorphic to the one we just visualized for `Real Madrid`, but this time using the `networkx` package. First we keep only the relevant columns of the `pass_Real_new` dataset:
pass_Real_new = pass_Real_new[['pass_maker', 'pass_receiver', 'number_of_passes']]
pass_Real_new
| | pass_maker | pass_receiver | number_of_passes |
---|---|---|---|
0 | 14 | 2 | 1 |
1 | 7 | 2 | 3 |
2 | 22 | 2 | 2 |
3 | 9 | 2 | 2 |
4 | 10 | 2 | 10 |
... | ... | ... | ... |
73 | 2 | 1 | 1 |
74 | 9 | 1 | 1 |
75 | 5 | 1 | 2 |
76 | 4 | 1 | 1 |
77 | 8 | 1 | 1 |
78 rows × 3 columns
Next, we convert `pass_Real_new` to a list of tuples, one tuple per row. We will use this list as the edge list of the `networkx` graph.
L_Real = pass_Real_new.apply(tuple, axis=1).tolist()
print(L_Real)
[('14', '2', 1), ('7', '2', 3), ('22', '2', 2), ('9', '2', 2), ('10', '2', 10), ('12', '2', 2), ('5', '2', 3), ('4', '2', 3), ('8', '2', 1), ('14', '10', 1), ('7', '10', 1), ('2', '10', 7), ('22', '10', 1), ('12', '10', 1), ('5', '10', 5), ('4', '10', 2), ('8', '10', 5), ('14', '12', 1), ('7', '12', 4), ('22', '12', 2), ('1', '12', 2), ('10', '12', 1), ('4', '12', 9), ('8', '12', 4), ('14', '5', 1), ('2', '5', 5), ('1', '5', 2), ('10', '5', 3), ('12', '5', 2), ('4', '5', 5), ('8', '5', 4), ('14', '4', 1), ('7', '4', 1), ('22', '4', 5), ('9', '4', 1), ('1', '4', 4), ('10', '4', 1), ('12', '4', 2), ('5', '4', 6), ('8', '4', 10), ('14', '8', 6), ('2', '8', 1), ('22', '8', 4), ('9', '8', 4), ('1', '8', 1), ('10', '8', 4), ('12', '8', 5), ('5', '8', 4), ('4', '8', 9), ('7', '9', 1), ('2', '9', 1), ('22', '9', 1), ('1', '9', 1), ('10', '9', 1), ('12', '9', 3), ('5', '9', 1), ('8', '9', 2), ('2', '14', 2), ('9', '14', 2), ('10', '14', 1), ('12', '14', 2), ('5', '14', 1), ('8', '14', 2), ('2', '7', 2), ('22', '7', 2), ('9', '7', 1), ('12', '7', 2), ('4', '7', 1), ('8', '7', 2), ('2', '22', 3), ('12', '22', 4), ('4', '22', 4), ('8', '22', 8), ('2', '1', 1), ('9', '1', 1), ('5', '1', 2), ('4', '1', 1), ('8', '1', 1)]
G_Real = nx.DiGraph()
for i in range(len(L_Real)):
    G_Real.add_edge(L_Real[i][0], L_Real[i][1], weight = L_Real[i][2])
edges_Real = G_Real.edges()
weights_Real = [G_Real[u][v]['weight'] for u, v in edges_Real]
nx.draw(G_Real, node_size=800, with_labels=True, node_color='white', width = weights_Real)
plt.gca().collections[0].set_edgecolor('black') # sets the edge color of the nodes to black
plt.title("Pass network for Real Madrid vs Liverpool", size = 20)
plt.show()
For `Liverpool` too, let us first clean the `pass_Liv_new` dataset and then draw the isomorphic weighted directed graph:
pass_Liv_new = pass_Liv_new[['pass_maker', 'pass_receiver', 'number_of_passes']]
pass_Liv_new
| | pass_maker | pass_receiver | number_of_passes |
---|---|---|---|
0 | 5 | 26 | 4 |
1 | 7 | 26 | 1 |
2 | 14 | 26 | 1 |
3 | 1 | 26 | 1 |
4 | 66 | 26 | 1 |
... | ... | ... | ... |
59 | 66 | 6 | 1 |
60 | 4 | 6 | 3 |
61 | 7 | 19 | 2 |
62 | 14 | 19 | 1 |
63 | 11 | 19 | 1 |
64 rows × 3 columns
L_Liv = pass_Liv_new.apply(tuple, axis=1).tolist()
G_Liv = nx.DiGraph()
for i in range(len(L_Liv)):
    G_Liv.add_edge(L_Liv[i][0], L_Liv[i][1], weight = L_Liv[i][2])
edges_Liv = G_Liv.edges()
weights_Liv = [G_Liv[u][v]['weight'] for u, v in edges_Liv]
nx.draw(G_Liv, node_size = 800, with_labels = True, node_color = 'red', width = weights_Liv)
plt.gca().collections[0].set_edgecolor('black') # sets the edge color of the nodes to black
plt.show()
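The edge-by-edge loop used above can also be written with `add_weighted_edges_from()`, which takes a list of `(source, target, weight)` tuples directly. The sketch below uses three made-up edges; the jersey numbers and weights are for illustration only.

```python
import networkx as nx

# (pass maker, pass receiver, number of passes): hypothetical values
edges = [("8", "4", 10), ("4", "8", 9), ("2", "10", 7)]

G = nx.DiGraph()
# Each tuple becomes a directed edge whose 'weight' attribute is the third item
G.add_weighted_edges_from(edges)

# Collect the weights in edge order, e.g. to use as line widths when drawing
weights = [data["weight"] for _, _, data in G.edges(data=True)]
print(G.number_of_nodes(), G.number_of_edges(), weights)
```

Because the graph is directed, the edges `8 -> 4` and `4 -> 8` are stored separately with their own weights.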
Let us discuss some of the important functions from the `networkx` package that we have employed for drawing graphs:
- `DiGraph()` creates an empty directed graph; it is the base class for directed graphs,
- `add_edge()` adds an edge between the two nodes given by its first two arguments, and the `weight` parameter sets the weight of this edge,
- `draw()` visualizes a `networkx` graph; its parameters are self-explanatory.
Let us now understand the degree, indegree and outdegree of a node in a directed weighted graph. The indegree of a node is the number of edges directed towards the node, i.e., in our case, the number of distinct teammates from whom a player (node) received passes. Similarly, the outdegree is the number of edges directed outwards from the node, i.e., the number of distinct teammates to whom the player passed. Finally, the degree of a node is the total number of edges connected to it (ignoring the directions of the edges), so it is the sum of its indegree and outdegree. Because each edge is counted once regardless of its weight, these values count passing links rather than individual passes; summing the edge weights instead gives the total number of passes received and given.
We will use `networkx` to find the node degrees in the pass network of `Real Madrid`.
# Prepare a dictionary with jersey numbers as the node ids,
# i.e, the dictionary keys and degrees as the dictionary values
deg_Real = dict(nx.degree(G_Real))
# convert a dictionary to a pandas dataframe
degree_Real = pd.DataFrame.from_dict(list(deg_Real.items()))
degree_Real.rename(columns = {0:'jersey_number', 1: 'node_degree'}, inplace = True)
degree_Real
jersey_number | node_degree | |
---|---|---|
0 | 14 | 12 |
1 | 2 | 17 |
2 | 7 | 11 |
3 | 22 | 11 |
4 | 9 | 14 |
5 | 10 | 15 |
6 | 12 | 16 |
7 | 5 | 14 |
8 | 4 | 17 |
9 | 8 | 19 |
10 | 1 | 10 |
Looking at the degrees for Real Madrid in that game, we notice that the player with jersey number 8 (i.e., Toni Kroos) had the highest degree value of 19. Ranked second are the players with jersey numbers 2 and 4, with a degree value of 17, i.e., our favorite Spanish defenders 'Daniel Carvajal Ramos' and 'Sergio Ramos García' respectively. Tremendous! Let us use `seaborn` to visualize the `deg_Real` dictionary via a bar plot:
X = list(deg_Real.keys())
Y = list(deg_Real.values())
sns.barplot(x = Y, y = X, palette = "magma")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("degree")
plt.title("Player pass degrees for Real Madrid vs Liverpool", size = 16)
plt.show()
We repeat this for Liverpool too:
# Prepare a dictionary with jersey numbers as the node ids,
# i.e, the dictionary keys and degrees as the dictionary values
deg_Liv = dict(nx.degree(G_Liv))
# convert a dictionary to a pandas dataframe
degree_Liv = pd.DataFrame.from_dict(list(deg_Liv.items()))
degree_Liv.rename(columns = {0:'jersey_number', 1: 'node_degree'}, inplace = True)
degree_Liv
jersey_number | node_degree | |
---|---|---|
0 | 5 | 12 |
1 | 26 | 11 |
2 | 7 | 17 |
3 | 14 | 17 |
4 | 1 | 7 |
5 | 66 | 13 |
6 | 4 | 12 |
7 | 11 | 11 |
8 | 6 | 12 |
9 | 9 | 10 |
10 | 19 | 6 |
For Liverpool, the highest degree value of 17 is shared by the players with jersey numbers 14 and 7, i.e., 'Jordan Brian Henderson' and 'James Philip Milner' respectively. We will visualize the `deg_Liv` dictionary via a bar plot:
X = list(deg_Liv.keys())
Y = list(deg_Liv.values())
sns.barplot(x = Y, y = X, palette = "magma")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("degree")
plt.title("Player pass degrees for Liverpool vs Real Madrid", size = 16)
plt.show()
indeg_Real = dict(G_Real.in_degree())
indegree_Real = pd.DataFrame.from_dict(list(indeg_Real.items()))
indegree_Real.rename(columns = {0:'jersey_number', 1: 'node_indegree'}, inplace = True)
X = list(indeg_Real.keys())
Y = list(indeg_Real.values())
sns.barplot(x = Y, y = X, palette = "hls")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("indegree")
plt.title("Player pass indegrees for Real Madrid vs Liverpool", size = 16)
plt.show()
indeg_Liv = dict(G_Liv.in_degree())
indegree_Liv = pd.DataFrame.from_dict(list(indeg_Liv.items()))
indegree_Liv.rename(columns = {0:'jersey_number', 1: 'node_indegree'}, inplace = True)
X = list(indeg_Liv.keys())
Y = list(indeg_Liv.values())
sns.barplot(x = Y, y = X, palette = "hls")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("indegree")
plt.title("Player pass indegrees for Liverpool vs Real Madrid", size = 16)
plt.show()
outdeg_Real = dict(G_Real.out_degree())
outdegree_Real = pd.DataFrame.from_dict(list(outdeg_Real.items()))
outdegree_Real.rename(columns = {0:'jersey_number', 1: 'node_outdegree'}, inplace = True)
X = list(outdeg_Real.keys())
Y = list(outdeg_Real.values())
sns.barplot(x = Y, y = X, palette = "hls")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("outdegree")
plt.title("Player pass outdegrees for Real Madrid vs Liverpool", size = 16)
plt.show()
outdeg_Liv = dict(G_Liv.out_degree())
outdegree_Liv = pd.DataFrame.from_dict(list(outdeg_Liv.items()))
outdegree_Liv.rename(columns = {0:'jersey_number', 1: 'node_outdegree'}, inplace = True)
X = list(outdeg_Liv.keys())
Y = list(outdeg_Liv.values())
sns.barplot(x = Y, y = X, palette = "hls")
plt.xticks(range(0, max(Y)+5, 2))
plt.ylabel("Player Jersey number")
plt.xlabel("outdegree")
plt.title("Player pass outdegrees for Liverpool vs Real Madrid", size = 16)
plt.show()
Next, we compute the adjacency matrices of the `G_Real` and `G_Liv` graphs:
A_Real = nx.adjacency_matrix(G_Real)
A_Liv = nx.adjacency_matrix(G_Liv)
A_Real = A_Real.todense()
A_Liv = A_Liv.todense()
sns.heatmap(A_Real, annot = True, cmap ='gnuplot')
plt.title("Adjacency matrix for Real Madrid's pass network")
plt.show()
sns.heatmap(A_Liv, annot = True, cmap ='gnuplot')
plt.title("Adjacency matrix for Liverpool's pass network")
plt.show()
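The rows and columns of the adjacency matrices above follow the node order of each graph, so the heatmap axes show positions rather than jersey numbers. As a hedged sketch on a toy graph (made-up pass counts), `nx.to_numpy_array` with an explicit `nodelist` fixes the order, and the row and column sums then recover the weighted out- and indegrees.

```python
import networkx as nx

# Toy directed pass network (made-up numbers).
G_toy = nx.DiGraph()
G_toy.add_edge("1", "4", weight=3)
G_toy.add_edge("4", "8", weight=6)
G_toy.add_edge("8", "4", weight=2)

order = ["1", "4", "8"]  # explicit row/column order (jersey numbers)
A_toy = nx.to_numpy_array(G_toy, nodelist=order, weight="weight")

# Entry [i][j] holds the number of passes from order[i] to order[j].
row_sums = A_toy.sum(axis=1)  # passes made by each player
col_sums = A_toy.sum(axis=0)  # passes received by each player

print(A_toy)
print(row_sums, col_sums)
```

The same `order` list could then be handed to `sns.heatmap` through its `xticklabels` and `yticklabels` parameters to label the axes with jersey numbers.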
r_Real = nx.degree_pearson_correlation_coefficient(G_Real, weight = 'weight')
r_Liv = nx.degree_pearson_correlation_coefficient(G_Liv, weight = 'weight')
print(r_Real, r_Liv)
-0.17983836432860179 -0.2412372196699064
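Next, we take the reciprocal of the 'weight' column in the pass network, so that players who exchanged many passes are joined by a short edge. Let us create a new graph for Real Madrid:
pass_Real_mod = pass_Real_new[['pass_maker', 'pass_receiver']].copy() # copy, so that adding a column does not trigger a SettingWithCopyWarning
pass_Real_mod['1/nop'] = 1/pass_Real_new['number_of_passes']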
pass_Real_mod.head(5)
pass_maker | pass_receiver | 1/nop | |
---|---|---|---|
0 | 14 | 2 | 1.000000 |
1 | 7 | 2 | 0.333333 |
2 | 22 | 2 | 0.500000 |
3 | 9 | 2 | 0.500000 |
4 | 10 | 2 | 0.100000 |
L_Real_mod = pass_Real_mod.apply(tuple, axis=1).tolist()
G_Real_mod = nx.DiGraph()
for i in range(len(L_Real_mod)):
    G_Real_mod.add_edge(L_Real_mod[i][0], L_Real_mod[i][1], weight = L_Real_mod[i][2])
edges_Real_mod = G_Real_mod.edges()
weights_Real_mod = [G_Real_mod[u][v]['weight'] for u, v in edges_Real_mod]
nx.draw(G_Real_mod, node_size=800, with_labels=True, node_color='white', width = weights_Real_mod)
plt.gca().collections[0].set_edgecolor('black')
plt.title("Modified pass network for Real Madrid vs Liverpool", size = 20)
plt.show()
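We do the same for Liverpool too:
pass_Liv_mod = pass_Liv_new[['pass_maker', 'pass_receiver']].copy() # copy, so that adding a column does not trigger a SettingWithCopyWarning
pass_Liv_mod['1/nop'] = 1/pass_Liv_new['number_of_passes']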
pass_Liv_mod.head(5)
pass_maker | pass_receiver | 1/nop | |
---|---|---|---|
0 | 5 | 26 | 0.25 |
1 | 7 | 26 | 1.00 |
2 | 14 | 26 | 1.00 |
3 | 1 | 26 | 1.00 |
4 | 66 | 26 | 1.00 |
L_Liv_mod = pass_Liv_mod.apply(tuple, axis=1).tolist()
G_Liv_mod = nx.DiGraph()
for i in range(len(L_Liv_mod)):
    G_Liv_mod.add_edge(L_Liv_mod[i][0], L_Liv_mod[i][1], weight = L_Liv_mod[i][2])
edges_Liv_mod = G_Liv_mod.edges()
weights_Liv_mod = [G_Liv_mod[u][v]['weight'] for u, v in edges_Liv_mod]
nx.draw(G_Liv_mod, node_size=800, with_labels=True, node_color='red', width = weights_Liv_mod)
plt.gca().collections[0].set_edgecolor('black')
plt.title("Modified pass network for Liverpool vs Real Madrid", size = 20)
plt.show()
Let us compute the shortest paths between every pair of players for Real Madrid:
dis_Real = nx.shortest_path(G_Real_mod, weight = 'weight')
print(dis_Real)
{'14': {'14': ['14'], '2': ['14', '8', '10', '2'], '10': ['14', '8', '10'], '12': ['14', '8', '4', '12'], '5': ['14', '8', '5'], '4': ['14', '8', '4'], '8': ['14', '8'], '9': ['14', '8', '9'], '7': ['14', '8', '7'], '22': ['14', '8', '22'], '1': ['14', '8', '5', '1']}, '2': {'2': ['2'], '10': ['2', '10'], '5': ['2', '5'], '8': ['2', '10', '8'], '9': ['2', '5', '4', '12', '9'], '14': ['2', '14'], '7': ['2', '7'], '22': ['2', '22'], '1': ['2', '5', '1'], '12': ['2', '5', '4', '12'], '4': ['2', '5', '4']}, '7': {'7': ['7'], '2': ['7', '2'], '10': ['7', '2', '10'], '12': ['7', '12'], '4': ['7', '12', '8', '4'], '9': ['7', '12', '9'], '5': ['7', '2', '5'], '8': ['7', '12', '8'], '14': ['7', '12', '14'], '22': ['7', '12', '22'], '1': ['7', '2', '5', '1']}, '22': {'22': ['22'], '2': ['22', '2'], '10': ['22', '8', '10'], '12': ['22', '4', '12'], '4': ['22', '4'], '8': ['22', '8'], '9': ['22', '4', '12', '9'], '7': ['22', '7'], '5': ['22', '4', '5'], '1': ['22', '4', '5', '1'], '14': ['22', '8', '14']}, '9': {'9': ['9'], '2': ['9', '2'], '4': ['9', '8', '4'], '8': ['9', '8'], '14': ['9', '14'], '7': ['9', '8', '7'], '1': ['9', '1'], '10': ['9', '8', '10'], '12': ['9', '8', '4', '12'], '5': ['9', '8', '5'], '22': ['9', '8', '22']}, '10': {'10': ['10'], '2': ['10', '2'], '12': ['10', '8', '4', '12'], '5': ['10', '2', '5'], '4': ['10', '8', '4'], '8': ['10', '8'], '9': ['10', '8', '9'], '14': ['10', '2', '14'], '7': ['10', '2', '7'], '22': ['10', '8', '22'], '1': ['10', '2', '5', '1']}, '12': {'12': ['12'], '2': ['12', '2'], '10': ['12', '8', '10'], '5': ['12', '8', '5'], '4': ['12', '8', '4'], '8': ['12', '8'], '9': ['12', '9'], '14': ['12', '14'], '7': ['12', '7'], '22': ['12', '22'], '1': ['12', '8', '5', '1']}, '5': {'5': ['5'], '2': ['5', '10', '2'], '10': ['5', '10'], '4': ['5', '4'], '8': ['5', '8'], '9': ['5', '4', '12', '9'], '14': ['5', '8', '14'], '1': ['5', '1'], '12': ['5', '4', '12'], '7': ['5', '8', '7'], '22': ['5', '8', '22']}, '4': {'4': ['4'], '2': ['4', 
'2'], '10': ['4', '8', '10'], '12': ['4', '12'], '5': ['4', '5'], '8': ['4', '8'], '7': ['4', '12', '7'], '22': ['4', '8', '22'], '1': ['4', '5', '1'], '9': ['4', '12', '9'], '14': ['4', '12', '14']}, '8': {'8': ['8'], '2': ['8', '10', '2'], '10': ['8', '10'], '12': ['8', '4', '12'], '5': ['8', '5'], '4': ['8', '4'], '9': ['8', '9'], '14': ['8', '14'], '7': ['8', '7'], '22': ['8', '22'], '1': ['8', '5', '1']}, '1': {'1': ['1'], '12': ['1', '4', '12'], '5': ['1', '4', '5'], '4': ['1', '4'], '8': ['1', '4', '8'], '9': ['1', '4', '12', '9'], '2': ['1', '4', '2'], '10': ['1', '4', '8', '10'], '7': ['1', '4', '12', '7'], '22': ['1', '4', '8', '22'], '14': ['1', '4', '12', '14']}}
Suppose we want the shortest path from 'Keylor Navas Gamboa' (jersey number 1) to 'Cristiano Ronaldo dos Santos Aveiro' (jersey number 7). We will type the following:
print(dis_Real['1']['7'])
['1', '4', '12', '7']
This tells us that the shortest path for the ball from 'Keylor Navas Gamboa' (jersey: 1) to 'Cristiano Ronaldo dos Santos Aveiro' (jersey: 7) was to pass first to 'Sergio Ramos García' (jersey: 4), who would pass to 'Marcelo Vieira da Silva Júnior' (jersey: 12), who would ultimately pass to 'Cristiano Ronaldo dos Santos Aveiro'. This seems like a good post-match analysis tool. I got this idea after discussing with Sarath Babu.

Similarly for Liverpool:
dis_Liv = nx.shortest_path(G_Liv_mod, weight = 'weight')
print(dis_Liv)
{'5': {'5': ['5'], '26': ['5', '26'], '7': ['5', '26', '7'], '14': ['5', '14'], '4': ['5', '4'], '11': ['5', '11'], '66': ['5', '26', '7', '66'], '9': ['5', '26', '9'], '1': ['5', '14', '1'], '6': ['5', '14', '6'], '19': ['5', '26', '7', '19']}, '26': {'26': ['26'], '5': ['26', '5'], '7': ['26', '7'], '14': ['26', '14'], '9': ['26', '9'], '4': ['26', '4'], '11': ['26', '9', '11'], '66': ['26', '7', '66'], '1': ['26', '14', '1'], '6': ['26', '14', '6'], '19': ['26', '7', '19']}, '7': {'7': ['7'], '26': ['7', '66', '5', '26'], '5': ['7', '66', '5'], '14': ['7', '14'], '9': ['7', '66', '9'], '4': ['7', '4'], '1': ['7', '1'], '11': ['7', '66', '11'], '66': ['7', '66'], '6': ['7', '14', '6'], '19': ['7', '19']}, '14': {'14': ['14'], '26': ['14', '5', '26'], '5': ['14', '5'], '7': ['14', '7'], '4': ['14', '4'], '1': ['14', '1'], '66': ['14', '7', '66'], '6': ['14', '6'], '19': ['14', '7', '19'], '11': ['14', '7', '66', '11'], '9': ['14', '5', '26', '9']}, '1': {'1': ['1'], '26': ['1', '26'], '14': ['1', '14'], '4': ['1', '6', '4'], '6': ['1', '6'], '7': ['1', '6', '7'], '11': ['1', '6', '66', '11'], '66': ['1', '6', '66'], '5': ['1', '6', '66', '5'], '9': ['1', '6', '66', '9'], '19': ['1', '6', '7', '19']}, '66': {'66': ['66'], '26': ['66', '5', '26'], '5': ['66', '5'], '14': ['66', '14'], '9': ['66', '9'], '11': ['66', '11'], '6': ['66', '14', '6'], '7': ['66', '14', '7'], '4': ['66', '5', '4'], '19': ['66', '11', '19'], '1': ['66', '14', '1']}, '4': {'4': ['4'], '26': ['4', '26'], '5': ['4', '26', '5'], '14': ['4', '26', '14'], '66': ['4', '6', '66'], '6': ['4', '6'], '7': ['4', '26', '7'], '9': ['4', '26', '9'], '1': ['4', '6', '1'], '11': ['4', '6', '66', '11'], '19': ['4', '26', '7', '19']}, '11': {'11': ['11'], '5': ['11', '66', '5'], '7': ['11', '9', '7'], '9': ['11', '9'], '4': ['11', '4'], '66': ['11', '66'], '19': ['11', '19'], '14': ['11', '9', '14'], '6': ['11', '9', '14', '6'], '26': ['11', '66', '5', '26'], '1': ['11', '9', '14', '1']}, '6': {'6': ['6'], 
'7': ['6', '7'], '14': ['6', '66', '14'], '4': ['6', '4'], '1': ['6', '1'], '11': ['6', '66', '11'], '66': ['6', '66'], '26': ['6', '4', '26'], '5': ['6', '66', '5'], '9': ['6', '66', '9'], '19': ['6', '7', '19']}, '9': {'9': ['9'], '7': ['9', '7'], '14': ['9', '14'], '11': ['9', '11'], '66': ['9', '11', '66'], '6': ['9', '14', '6'], '5': ['9', '14', '5'], '4': ['9', '14', '4'], '19': ['9', '7', '19'], '26': ['9', '14', '5', '26'], '1': ['9', '14', '1']}, '19': {'19': ['19'], '7': ['19', '7'], '14': ['19', '14'], '9': ['19', '9'], '11': ['19', '9', '11'], '66': ['19', '9', '11', '66'], '6': ['19', '14', '6'], '5': ['19', '14', '5'], '4': ['19', '14', '4'], '26': ['19', '14', '5', '26'], '1': ['19', '14', '1']}}
print(dis_Liv['1']['9'])
['1', '6', '66', '9']
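To see why the reciprocal weights make heavily used passing links "short", here is a minimal sketch on a toy network. The player labels and pass counts are made up for illustration; only the `1/number_of_passes` idea comes from the cells above.

```python
import networkx as nx

# Made-up pass counts: A and C are linked directly by a single pass,
# but A -> B -> C is used far more often.
passes = {("A", "B"): 10, ("B", "C"): 5, ("A", "C"): 1}

G_inv = nx.DiGraph()
for (maker, receiver), n in passes.items():
    # reciprocal weight: many passes -> short edge
    G_inv.add_edge(maker, receiver, weight=1 / n)

path = nx.shortest_path(G_inv, source="A", target="C", weight="weight")
length = nx.shortest_path_length(G_inv, source="A", target="C", weight="weight")

print(path)    # route with the smallest total 1/number_of_passes
print(length)
```

The two-step route costs 1/10 + 1/5 = 0.3, which beats the direct link with cost 1, so the weighted shortest path follows the frequently used connections.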
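The eccentricity of a player node `p` tells us how far the furthest player node from `p` is positioned in the pass network. Called as below without a weight, `nx.eccentricity` measures this distance in number of edges (hops). Let us calculate the eccentricities for all the 11 nodes for Real Madrid.
E_Real = nx.eccentricity(G_Real_mod)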
print(E_Real)
{'14': 2, '2': 2, '7': 2, '22': 2, '9': 2, '10': 2, '12': 2, '5': 2, '4': 2, '8': 1, '1': 2}
av_E_Real = sum(list(E_Real.values()))/len(E_Real)
print(av_E_Real)
1.9090909090909092
For Liverpool:
E_Liv = nx.eccentricity(G_Liv_mod)
print(E_Liv)
{'5': 2, '26': 2, '7': 1, '14': 2, '1': 2, '66': 2, '4': 2, '11': 2, '6': 2, '9': 2, '19': 2}
av_E_Liv = sum(list(E_Liv.values()))/len(E_Liv)
print(av_E_Liv)
1.9090909090909092
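As a small illustration of eccentricity, consider a toy network in which one player exchanges passes with two teammates who never pass to each other directly. The labels are made up; the sketch assumes every player can reach every other one, since `nx.eccentricity` raises an error for a directed graph that is not strongly connected.

```python
import networkx as nx

# Toy network: "8" is the hub, "2" and "7" only connect through "8".
G_toy = nx.DiGraph()
G_toy.add_edges_from([("8", "2"), ("2", "8"), ("8", "7"), ("7", "8")])

ecc = nx.eccentricity(G_toy)   # hop distance to the furthest node
radius = nx.radius(G_toy)      # smallest eccentricity
diameter = nx.diameter(G_toy)  # largest eccentricity
center = nx.center(G_toy)      # nodes whose eccentricity equals the radius

print(ecc)
print(radius, diameter, center)
```

The hub reaches everyone in one pass, so its eccentricity is 1, while the other two need two passes to reach each other. This mirrors the real outputs above, where exactly one player per team has eccentricity 1.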
Let us compute the average clustering coefficient for `G_Real` (note that this graph should not be the modified version):
cc_Real = nx.average_clustering(G_Real, weight = 'weight')
print(cc_Real)
0.182334851979709
For Liverpool:
cc_Liv = nx.average_clustering(G_Liv, weight = 'weight')
print(cc_Liv)
0.27664278424505534
The average clustering coefficient is lower for Real Madrid's pass network (about 0.18 against 0.28), suggesting that fewer players passed the ball among each other compared to Liverpool.

Finally, we compute the centrality (especially the betweenness centrality) for each node in either team's pass network to understand which player was the most important in their pass network. For Real Madrid:
bc_Real = nx.betweenness_centrality(G_Real, weight = 'weight')
print(bc_Real)
{'14': 0.15222222222222223, '2': 0.10685185185185186, '7': 0.05592592592592593, '22': 0.0, '9': 0.14462962962962964, '10': 0.12407407407407407, '12': 0.009259259259259259, '5': 0.007407407407407408, '4': 0.06851851851851852, '8': 0.031481481481481485, '1': 0.11703703703703704}
max_bc_Real = max(bc_Real, key = bc_Real.get)
print(max_bc_Real)
14
For Liverpool:
bc_Liv = nx.betweenness_centrality(G_Liv, weight = 'weight')
print(bc_Liv)
max_bc_Liv = max(bc_Liv, key = bc_Liv.get)
print(max_bc_Liv)
{'5': 0.06296296296296296, '26': 0.016666666666666666, '7': 0.2453703703703704, '14': 0.12407407407407407, '1': 0.002777777777777778, '66': 0.075, '4': 0.07222222222222222, '11': 0.05555555555555556, '6': 0.1259259259259259, '9': 0.021296296296296296, '19': 0.03888888888888889}
7
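One point worth keeping in mind when reading these numbers: `networkx` interprets the `weight` argument of `betweenness_centrality` as a distance. The sketch below uses a made-up toy network to show that raw pass counts and reciprocal pass counts can therefore single out different players; it is an illustration only, and it may be worth repeating the computation on `G_Real_mod` and `G_Liv_mod` for comparison.

```python
import networkx as nx

# Made-up pass counts: the route through B is heavily used,
# the route through D carries a single pass on each leg.
passes = {("A", "B"): 10, ("B", "C"): 10, ("A", "D"): 1, ("D", "C"): 1}

G_counts = nx.DiGraph()   # weight = number of passes
G_inverse = nx.DiGraph()  # weight = 1 / number of passes
for (maker, receiver), n in passes.items():
    G_counts.add_edge(maker, receiver, weight=n)
    G_inverse.add_edge(maker, receiver, weight=1 / n)

bc_counts = nx.betweenness_centrality(G_counts, weight="weight")
bc_inverse = nx.betweenness_centrality(G_inverse, weight="weight")

top_counts = max(bc_counts, key=bc_counts.get)
top_inverse = max(bc_inverse, key=bc_inverse.get)

print(top_counts, top_inverse)
```

With raw counts the rarely used route through D is the "shortest" one, so D comes out on top; with reciprocal weights the busy route through B wins instead.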
The players with the highest betweenness centrality were therefore 'Carlos Henrique Casimiro' (jersey: 14) from Real Madrid and 'James Philip Milner' (jersey: 7) from Liverpool. We have been able to compute some interesting results using complex network analysis on our pass networks. This completes my presentation. 😌😌😌😌😌😌😌😌😌